A stochastic process is represented as having two components. The
first component is called drift and measures location. The second
component, called noise, measures the variability of the stochastic
process. This paper is concerned with estimating the noise process
when the noise process is assumed to follow what we shall call a
generalized first order nonstationary autoregressive process. The
generalized first order autoregressive process is defined similarly
to the first order autoregressive process, except that the parameter
relating two observations is different for each time point. In order
to estimate these parameters it is necessary that the stochastic
process be replicated a sufficient number of times.

A method of estimating the parameters is proposed and the broad
attendant distribution theory is delineated, both in a general
setting and for specific situations. The properties of these
estimators are given and some tests of hypothesis concerning the
parameters are investigated. In order to comment further on the
value of the proposed estimators, we use as a benchmark the maximum
likelihood estimators. Their properties are given and a critical
comparison is made between them and the proposed estimators.

In any practical situation it will be necessary to decide whether or
not the first order generalized autoregressive process is
sufficiently accurate to describe the data. Therefore, a test of the
adequacy of the model is given.

Finally,  numerical  results  are  obtained  using  a 
computer  simulation.   The  proposed  estimators  and  the  maximum 
likelihood  estimators  are  compared.   Also  a  practical 
application  is  given. 


Chapter  I 
STATEMENT  OF  THE  PROBLEM 

1.1  Introduction

The statistical model in this dissertation is a stochastic process
$\{Y(t) : t \in T\}$. Usually $T$ will denote a time interval and we
shall suppose that replications of the process can be monitored
during $T$ at times $t_0 < t_1 < t_2 < \cdots < t_m$, a typical
replicate yielding the random sample $y_j = Y(t_j) : 0 \le j \le m$.
It is not necessary that the time increments $t_1 - t_0$,
$t_2 - t_1$, ..., $t_m - t_{m-1}$ be of equal length.

If we write $\mu(t) = EY(t)$ and $X(t) = Y(t) - \mu(t)$, then we may
think of $\mu(t)$ as the "drift" of the sample paths of $Y(t)$ and
$X(t)$ as "noise". Clearly $EX(t) = 0$. Various schemes have,
classically, been used to describe the noise process. In particular
one may assume that, with $y_j = \mu(t_j) + x_j : 0 \le j \le m$,

$$x_j = \alpha_1 x_{j-1} + \alpha_2 x_{j-2} + \cdots + \alpha_p x_{j-p} + \varepsilon_j : p \le j \le m, \tag{1.1.1}$$

where $\varepsilon_p, \varepsilon_{p+1}, \ldots, \varepsilon_m$ are
independent identically distributed random variables.

In order that the data lend themselves to analysis under this
classical model, several assumptions must be made. The most
restrictive of these is that of stationarity. Expressed informally,
stationarity assumes that the process has been running a
sufficiently long time so that it has settled down. Putting this
into a probabilistic context, stationarity implies that the
probability distribution of $x_{t_1}, x_{t_2}, \ldots, x_{t_k}$ is
the same as the probability distribution of
$x_{t_1+\tau}, x_{t_2+\tau}, \ldots, x_{t_k+\tau}$ for every finite
set of values $(t_1, t_2, \ldots, t_k)$ and for every finite $\tau$.

The classical analysis of the model of equation (1.1.1), known as
the p-th order autoregressive model, is likely to be inappropriate
in many cases due to the requirement of stationarity. For example,
consider observing the effect of a diet on weight loss. Initially
the weight loss will be greatest and will tend toward zero as time
goes on and the subject tends to some constant weight. Obviously,
since the larger values appear first, the probability distribution
of the initial observations is not the same as that occurring
later. A second example is the effect of drug infusion. A patient
is given a dose of some drug, either orally or intravenously, and
blood samples are drawn at various times $t_0, t_1, \ldots, t_m$
thereafter. The amount of drug in the blood is then measured for
each sample at each time. Again the initial readings will be larger
than the later ones since the drug will be absorbed into the system
or discharged as time goes on.

It may be argued in both examples cited that successive differences
(or perhaps successive second differences) have, approximately, a
stationary distribution. Rather than concede to ad hoc procedures
we prefer to replace the stationary autoregressive process by a
nonstationary process. We gain this generality in the model for
$\{Y(t) : t \in T\}$ at the expense of requiring (for our analysis
of, and estimation of the parameters of, the process) several
replications of this process. Fortunately, in many instances of
interest, replicates will be available.

The simplest alternative to the p-th order autoregressive scheme is
what we shall call the first order generalized autoregressive
scheme. Formally we shall assume that the errors
$x_1, x_2, \ldots, x_m$ satisfy

$$x_j = \alpha_j x_{j-1} + \varepsilon_j : 1 \le j \le m, \tag{1.1.2}$$

where again $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m$
are independent identically distributed random variables. It will
be assumed that the joint distribution of the errors is multivariate
normal. The assumption that the process can be replicated is needed
in order to estimate the unknown parameters
$\alpha_1, \alpha_2, \ldots, \alpha_m$.

In the analysis of this model we shall be concerned with three
major problems: (1) providing estimators for the unknown
parameters, (2) finding the distribution of the estimators and
comparing them to other estimators (typically likelihood
estimators), and (3) providing methodology for testing the goodness
of fit of the model.

1.2  Summary  of  Results 

A method of estimating the parameters is proposed in Chapter 2 and
the broad attendant distribution theory is delineated, both in a
general setting and for specific situations. The properties of
these estimators are provided in Chapter 3. In Chapter 4 a critical
comparison is made between the maximum likelihood estimators and
the estimators which we propose as an alternative. In any practical
situation it will be necessary to decide whether or not the first
order generalized autoregressive scheme is sufficiently accurate to
describe the data. A decision procedure bearing on this aspect is
investigated in Chapter 5.

An  application  of  the  theory  is  given  in  Chapter  6, 
with  a  comparison  between  the  estimators  derived  in  this 
dissertation  and  the  usual  maximum  likelihood  estimators. 

1.3  Notation

In almost all areas of statistics notation is very important:
consistent notation aids in solving and understanding the material
presented. It is also very convenient to abbreviate the
distributional properties of random variables. For these reasons,
certain notational conventions have been adopted. Although many of
these are standard, they will be listed here for reference.

1.  An  underscored  lower  case  letter  invariably 
represents  a  column  vector.   Its  row  dimension  will 
be  given  the  first  time  a  vector  appears.   Thus 

x:  (m)  denotes  a  column  vector  consisting  of  m  elements. 
The  vector  which  has  all  its  elements  zero  will  be 
denoted  by  0. 

2.  Matrices will be denoted by capital letters and the first time
a matrix appears, its row and column dimensions will be given.
Thus $M : (r \times c)$ denotes a matrix with r rows and c columns.
Denote the zero matrix by (0). The symbol "I" will be reserved for
the identity matrix.

3.  The elements of a matrix will be denoted by the corresponding
small letter with subscripts to denote their row and column
position. Thus $m_{ij}$ denotes the element in the i-th row and
j-th column of the matrix M. The symbol $(M)_{ij}$ is equivalent to
$m_{ij}$ and is sometimes substituted for convenience.

4.  It is sometimes convenient to form row and column vectors from
a given matrix. The symbol $(M)_{i\cdot}$ will represent the row
vector formed from the i-th row of M. Similarly $(M)_{\cdot j}$
will denote the column vector formed from the j-th column of M.

5.  The matrix formed from the first j rows and columns of M will
be denoted by $M_{(j)}$. $M_{(j)}$ is commonly called the j-th
principal submatrix.

6.  The Kronecker product of $M : (m \times m)$ and
$N : (n \times n)$ will be denoted by $M \otimes N$ and is a matrix
$P : (mn \times mn)$ with

$$p_{(k-1)n+i,\,(\ell-1)n+j} = m_{k\ell}\, n_{ij}.$$

7.  Diagonal matrices will be denoted by
$D = \mathrm{diag}(d_1, d_2, \ldots, d_m)$, where diag is short for
diagonal and $d_1, d_2, \ldots, d_m$ are the elements on the
diagonal.

8.  In keeping with conventional notation we shall write
$\mathrm{etr}(\cdot)$ to denote the constant "e" (the base of the
natural logarithms) raised to the $\mathrm{tr}(\cdot)$ power, that
is, $\mathrm{etr}(M) = \exp\{\mathrm{tr}(M)\}$.

9.  Transformations are often used in distribution theory. The
notation $J\{X \to Y\}$ will represent the Jacobian of the
transformation from the X-space into the Y-space.

10.  If y is a random variable having a normal density, with mean
$\mu$ and variance $\sigma^2$, we will write

$$y \sim N(\mu, \sigma^2).$$

11.  If y is a random variable defined on $(0, \infty)$ with
density

$$f(y) = \{\Gamma(\tfrac12\nu)\, 2^{\frac12(\nu-2)}\}^{-1}\, y^{\nu-1} \exp\{-\tfrac12 y^2\},$$

we will denote this by

$$y \sim \chi_\nu(0).$$

This is to be read as "y has the central Chi density on $\nu$
degrees of freedom."

12.  If y is a random variable defined on $(0, \infty)$ with
density

$$f(y) = \{\Gamma(\tfrac12\nu)\,(2\sigma^2)^{\frac12\nu}\}^{-1}\, y^{\frac12\nu-1}\, e^{-y/2\sigma^2},$$

we shall abbreviate this by

$$y \sim \sigma^2 \chi^2_\nu(0).$$

13.  If $\underline{y}$ is an m-dimensional column vector whose
elements have a joint normal density with mean vector
$\underline{\mu} : (m)$ and dispersion matrix $V : (m \times m)$,
this will be denoted by

$$\underline{y} \sim N_m(\underline{\mu}, V).$$


14.  If $\underline{y}_1, \underline{y}_2, \ldots, \underline{y}_n$
are mutually independent m-variate column vectors with

$$\underline{y}_i \sim N_m(\underline{\mu}, V) : i = 1, \ldots, n,$$

then, with
$Y = (\underline{y}_1, \underline{y}_2, \ldots, \underline{y}_n)$,
we will write

$$Y \sim N_{m \times n}(M, V \otimes I),$$

where $M = (\underline{\mu}, \underline{\mu}, \ldots, \underline{\mu})$.

15.  With Y an $m \times n$ matrix such that

$$Y \sim N_{m \times n}(M, V \otimes I),$$

the $m \times m$ matrix

$$W = YY'$$

has a noncentral Wishart distribution with dispersion matrix V,
degrees of freedom n, and noncentrality matrix $MM'$. We will write

$$W \sim W_m(V, n, MM').$$

16.  If W is an $m \times m$ symmetric matrix whose
$\tfrac12 m(m+1)$ mathematically independent elements have the
density

$$f(W) = K_m(V, \nu)\, |W|^{\frac12(\nu-m-1)}\, \mathrm{etr}\{-\tfrac12 V^{-1} W\}$$

over the set of positive definite matrices, where

$$K_m^{-1}(V, \nu) = 2^{\frac12 m\nu}\, \pi^{\frac14 m(m-1)}\, |V|^{\frac12\nu} \prod_{j=1}^{m} \Gamma(\tfrac12(\nu - j + 1)),$$

we will write

$$W \sim W_m(V, \nu, (0)).$$

17.  For referencing within the text $[\cdot]$ will denote
bibliographical references, while $(\cdot)$ will denote references
to equations. Thus [4] refers to the fourth entry in the
bibliography, while (1.2.3) refers to equation 3 in section 2 of
Chapter I.


Chapter  II 

THE  DOOLITTLE  DECOMPOSITION  AND  ASSOCIATED 
DISTRIBUTION  THEORY 


2.1  Introduction

In the event that the "noise" process, $X(t) = Y(t) - \mu(t)$, is
nonstationary, the general models and methods of estimation, like
those given by Box and Jenkins [5] and Anderson [2], are no longer
valid. We ignore the cases where the nonstationarity is caused by
trend, since this can be removed and the resulting series is
stationary and usual methods apply. An appropriate model for a
"noise" process with this irregular behavior is

$$x_j = \alpha_j x_{j-1} + \varepsilon_j : 1 \le j \le m.$$

Our purpose in this chapter is to find a method of estimating the
parameters of this model and to establish the attendant
distribution theory.

Throughout, $\{y_j : 0 \le j \le m\}$ will denote the observed
values of $Y(t)$ at times $t_0 < t_1 < \cdots < t_m$. We shall
assume that $y_j$ is normally distributed.

2.2  $\mathcal{V}$: A Class of Dispersion Matrices

If $\{X(t) = Y(t) - \mu(t) : t \in T\}$ is a nonstationary time
series, whose realizations, $x_j$, satisfy the relationship

$$x_j = \alpha_j x_{j-1} + \varepsilon_j : 1 \le j \le m, \tag{2.2.1}$$

then $\{X(t)\}$ is said to follow the first order generalized
autoregressive sequence.

Suppose we arrive at the (m+1)-variate column vector
$\underline{x} = (x_0, x_1, \ldots, x_m)'$, obtained from random
sampling from the above process. We assume that $\underline{x}$ is
an observation from the (m+1)-variate normal distribution, that is,

$$\underline{x} \sim N_{m+1}(\underline{\mu}, V),$$

where $\underline{\mu} = E\underline{x} = \underline{0}$, since
$EX(t) = 0$, and

$$V = \mathrm{Var}\,\underline{x} = E\,\underline{x}\,\underline{x}'.$$

In order to determine V we assume that
$x_0, \varepsilon_1, \ldots, \varepsilon_m$ are uncorrelated random
variables with

$$\mathrm{Var}(x_0) = \sigma^2 \beta^2 \tag{2.2.2}$$

and

$$\mathrm{Var}(\varepsilon_j) = \sigma^2 : 1 \le j \le m. \tag{2.2.3}$$


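Since the process follows equation (2.2.1) we can write
$\{x_0, x_1, \ldots, x_m\}$ in terms of
$\{x_0, \varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m\}$.
Applying equation (2.2.1) recursively we find the relationship
between $x_j$ and $x_0, \varepsilon_1, \ldots, \varepsilon_m$.
Letting $\underline{x} : (m+1) = (x_0, x_1, \ldots, x_m)'$,
$\underline{\varepsilon} : (m) = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_m)'$
and $A : (m+1 \times m+1)$ having elements

$$a_{ii} = 1 : 0 \le i \le m$$
$$a_{ij} = \alpha_i \alpha_{i-1} \cdots \alpha_{j+1} : 0 \le j < i \le m \tag{2.2.4}$$
$$a_{k\ell} = 0 : \text{elsewhere},$$

then it is easily verified that

$$\underline{x} = A \begin{pmatrix} x_0 \\ \underline{\varepsilon} \end{pmatrix}. \tag{2.2.5}$$

From expression (2.2.5) we see that

$$V = AUA', \tag{2.2.6}$$

where

$$U : (m+1 \times m+1) = \mathrm{diag}(\sigma^2\beta^2, \sigma^2, \sigma^2, \ldots, \sigma^2). \tag{2.2.7}$$

Writing out the elements of V explicitly we have

$$v_{00} = \sigma^2 \beta^2$$
$$v_{jj} = \sigma^2 + \alpha_j^2\, v_{j-1,j-1} : 1 \le j \le m \tag{2.2.8}$$
$$v_{jk} = v_{kj} = \alpha_k \alpha_{k-1} \cdots \alpha_{j+1}\, v_{jj} : 0 \le j < k \le m.$$

In the density of $\underline{x}$ we need $\Lambda = V^{-1}$;
because of the form of V, $\Lambda$ has a particularly simple form.
By taking the inverse of the product in (2.2.6) we find

$$\Lambda = (A^{-1})'\, U^{-1} A^{-1}. \tag{2.2.9}$$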

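The inverse of U is trivial and since A is lower triangular its
inverse is easily shown to have elements:

$$a^{jj} = 1 : 0 \le j \le m$$
$$a^{j+1,j} = -\alpha_{j+1} : 0 \le j \le m-1 \tag{2.2.10}$$
$$a^{jk} = 0 : \text{elsewhere}.$$

Hence the elements of $\Lambda$ are given by

$$\sigma^2 \Lambda_{00} = \beta^{-2} + \alpha_1^2 ; \quad \sigma^2 \Lambda_{mm} = 1$$
$$\sigma^2 \Lambda_{jj} = 1 + \alpha_{j+1}^2 : 1 \le j \le m-1 \tag{2.2.11}$$
$$\sigma^2 \Lambda_{j+1,j} = \sigma^2 \Lambda_{j,j+1} = -\alpha_{j+1} : 0 \le j \le m-1$$
$$\sigma^2 \Lambda_{jk} = 0 : \text{elsewhere}.$$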

A square matrix M, with $m_{ij} = 0 : |i-j| > 1$, is called a
"Jacobi matrix" in the literature. The Jacobi matrix
$\sigma^2 \Lambda$ can be factored into $\sigma^2 \Lambda = R'R$,
where R is lower triangular with

$$r_{00} = \beta^{-1}$$
$$r_{jj} = 1 : 1 \le j \le m \tag{2.2.12}$$
$$r_{j,j-1} = -\alpha_j : 1 \le j \le m$$
$$r_{jk} = 0 : \text{elsewhere}.$$

This result can be obtained from (2.2.9) very easily.

One further property of V is: let $V_{(j)}$ be the
$(j+1 \times j+1)$ principal submatrix formed from V, then

$$|V_{(j)}| = \sigma^{2(j+1)} \beta^2 : 0 \le j \le m. \tag{2.2.13}$$

This result can be obtained by partitioning the matrices in (2.2.6)
and taking the determinant of the corresponding product of the
partitioned matrices. Since A is lower triangular with unit
diagonal elements any leading square partition has determinant
equal to unity. The corresponding square partition of U is diagonal
with determinant equal to the right-hand side of (2.2.13).

We note that the special form of V and its properties are due to
the model. We shall let $\mathcal{V}$ denote the class of matrices
with this special form. Specifically, $\mathcal{V}$ is the class of
all positive definite matrices V such that $V^{-1}$ is a Jacobi
matrix. To see that $V \in \mathcal{V}$ implies $V^{-1}$ is a
Jacobi matrix we note that V may always be represented by $AUA'$
where A is lower triangular given by (2.2.4). Since $A^{-1}$ is
given by (2.2.10), the resulting product
$(A^{-1})'\, U^{-1} A^{-1}$ is always a Jacobi matrix.
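The structural claims of this section can be checked numerically. The following is a minimal sketch, assuming NumPy and some illustrative parameter values (the values of `alpha`, `sigma2` and `beta2` and the helper names `build_A`, `build_V` and `build_R` are not from the text). It builds A and U as in (2.2.4) and (2.2.7), forms V = AUA', and lets one confirm that the inverse is a Jacobi matrix, that it factors as in (2.2.12), and that the principal minors follow (2.2.13).

```python
import numpy as np

def build_A(alpha):
    """Lower-triangular A of (2.2.4); alpha[k-1] holds alpha_k."""
    m = len(alpha)
    A = np.eye(m + 1)
    for i in range(1, m + 1):
        for j in range(i):
            # a_ij = alpha_i * alpha_{i-1} * ... * alpha_{j+1} for j < i
            A[i, j] = np.prod(alpha[j:i])
    return A

def build_V(alpha, sigma2, beta2):
    """V = A U A' with U = diag(sigma2*beta2, sigma2, ..., sigma2)."""
    m = len(alpha)
    A = build_A(alpha)
    U = np.diag([sigma2 * beta2] + [sigma2] * m)
    return A @ U @ A.T

def build_R(alpha, beta2):
    """Lower-triangular R of (2.2.12)."""
    m = len(alpha)
    R = np.eye(m + 1)
    R[0, 0] = beta2 ** -0.5
    for j in range(1, m + 1):
        R[j, j - 1] = -alpha[j - 1]
    return R

# Illustrative values only (assumed for this sketch).
alpha = np.array([0.9, -0.4, 1.3])
sigma2, beta2 = 2.0, 0.5

V = build_V(alpha, sigma2, beta2)
Lam = np.linalg.inv(V)
R = build_R(alpha, beta2)

# Entries of Lam more than one step off the diagonal should vanish.
print(np.allclose(np.triu(Lam, 2), 0.0, atol=1e-12))
# sigma^2 * Lam should equal R'R.
print(np.allclose(sigma2 * Lam, R.T @ R))
# Principal minors should equal sigma^(2(j+1)) * beta^2.
minors = [np.linalg.det(V[:j + 1, :j + 1]) for j in range(len(alpha) + 1)]
print(np.allclose(minors, [sigma2 ** (j + 1) * beta2 for j in range(len(alpha) + 1)]))
```

The check uses a direct matrix inverse rather than the closed forms (2.2.10) and (2.2.11), so agreement is evidence for those formulas rather than a restatement of them.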

2.3  The Doolittle Decomposition and Its Jacobian

Suppose we observe a number of (m+1)-variate column vectors
$\underline{y}_j : 1 \le j \le n$ $(n > m+1)$, obtained by random
sampling from an (m+1)-variate normal population with mean
$\underline{\mu}$ and dispersion matrix V. It is well known that
the maximum likelihood estimates of $\underline{\mu}$ and V are
given by

$$\hat{\underline{\mu}} = \bar{\underline{y}} = \frac{1}{n} \sum_{j=1}^{n} \underline{y}_j \tag{2.3.1}$$

and

$$\hat{V} = \frac{1}{n} W, \tag{2.3.2}$$

where W is the $(m+1 \times m+1)$ matrix

$$W = \sum_{j=1}^{n} (\underline{y}_j - \bar{\underline{y}})(\underline{y}_j - \bar{\underline{y}})'. \tag{2.3.3}$$

W has the central Wishart distribution with $\nu = n - 1$ degrees
of freedom and dispersion matrix V. It is well known that
$EW = \nu V$, so that $\nu^{-1} W$ is an unbiased estimate of V. In
later sections we shall assume that $V \in \mathcal{V}$ so that V
may be written as $V = AUA'$, where A and U are defined in section
2. We note that the sub-diagonal of A is
$(\alpha_1, \alpha_2, \ldots, \alpha_m)$ with other elements being
products of the $\alpha$'s. Since A contains all the information on
the $\alpha$'s and U contains information on $\sigma^2$ and
$\beta^2$, if we could estimate these matrices we would have
estimates of the unknown parameters. Since W estimates V, perhaps a
transformation on the elements of W will give us estimates of A and
U. It is with this intuitive notion in mind that we proceed. We
assume that a matrix W is available with $\nu$ degrees of freedom,
and for the moment, that V is an arbitrary positive definite
matrix.

For convenience we label the rows and columns of W zero through m
(rather than 1 through m+1) and let $W_{(j)}$ be the
$(j+1 \times j+1)$ principal submatrix of W. Define

$$d_0 = |W_{(0)}|, \qquad d_j = |W_{(j)}| \cdot |W_{(j-1)}|^{-1} : 1 \le j \le m. \tag{2.3.4}$$

With $D : (m+1 \times m+1) = \mathrm{diag}(d_0, d_1, \ldots, d_m)$
define $G : (m+1 \times m+1)$, a lower triangular matrix with unit
diagonal elements, (uniquely) by

$$W = GDG'. \tag{2.3.5}$$

Since the (m+1) random diagonal elements of D and the
$\tfrac12 m(m+1)$ random elements $\{g_{ij} : 0 \le j < i \le m\}$
of G combine to give $\tfrac12 (m+2)(m+1)$ random variables we see
that the transformation from W into G and D is nonsingular. The
actual decomposition of W into G and D can be obtained using the
forward Doolittle procedure outlined in Rao [14] and Saw [16].
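As an illustration of the factorization W = GDG', the sketch below obtains G and D from a Cholesky factor instead of the forward Doolittle procedure cited above; this is a substitution made here for brevity, and it relies on the uniqueness of the factorization for positive definite W. NumPy is assumed, and the helper name `doolittle_gdg` and the test matrix are hypothetical. The final lines compare the diagonal of D with the ratios of principal minors that define the $d_j$.

```python
import numpy as np

def doolittle_gdg(W):
    """Return (G, D) with W = G D G', G unit lower triangular, D diagonal.

    Computed from the Cholesky factor L (W = L L'): column j of L is
    divided by L[j, j] to give G, and D holds the squared diagonal of L.
    """
    L = np.linalg.cholesky(W)
    diag = np.diag(L)
    G = L / diag               # broadcasting divides column j by diag[j]
    D = np.diag(diag ** 2)
    return G, D

# Hypothetical positive definite matrix, built from random data.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
W = X.T @ X

G, D = doolittle_gdg(W)
d = np.diag(D)

# d_0 = |W_(0)| and d_j = |W_(j)| / |W_(j-1)| for j >= 1.
minors = np.array([np.linalg.det(W[:j + 1, :j + 1]) for j in range(W.shape[0])])
ratios = minors / np.concatenate(([1.0], minors[:-1]))

print(np.allclose(G @ D @ G.T, W))
print(np.allclose(d, ratios))
```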

We wish to determine the joint density of G and D. This can be
obtained by using the density of W, denoted by $f(W)$, and
obtaining the Jacobian of the transformation from W into G and D
defined by (2.3.5). Denoting this Jacobian by $J\{W \to G, D\}$,
and the joint density on G and D by $h(G, D)$, then

$$h(G, D) = f(GDG')\, J\{W \to G, D\}. \tag{2.3.6}$$

Direct evaluation of the Jacobian is cumbersome and the method used
here is due to Hsu as reported by Deemer and Olkin [7]. Since the
derivation of the Jacobian is rather long the rest of this section
is devoted to it.

We seek the Jacobian of the transformation from W to G and D
defined by

$$W = GDG', \tag{2.3.7}$$

with all matrices $(m+1 \times m+1)$ and G lower triangular with
unit diagonals. Let $[\delta G]$ and $[\delta D]$, both
$(m+1 \times m+1)$, denote small changes in G and D, respectively.
Suppose that the changes $[\delta G]$ in G and $[\delta D]$ in D
bring about a change $[\delta W]$ in W so that (2.3.5) is
preserved. That is

$$W + [\delta W] = (G + [\delta G])(D + [\delta D])(G + [\delta G])'. \tag{2.3.8}$$

Expanding equation (2.3.8) and dropping terms of second order in
the $[\delta\,\cdot\,]$ $[\,\cdot \in (G, D)]$, we find that

$$W + [\delta W] = GDG' + [\delta G]DG' + G[\delta D]G' + GD[\delta G]'. \tag{2.3.9}$$

Since $W = GDG'$, we see that

$$[\delta W] = [\delta G]DG' + G[\delta D]G' + GD[\delta G]'. \tag{2.3.10}$$

Hsu has shown that

$$J\{W \to G, D\} = J\{[\delta W] \to [\delta G], [\delta D]\}, \tag{2.3.11}$$

where $J\{[\delta W] \to [\delta G], [\delta D]\}$ is the Jacobian
of the transformation defined by (2.3.10), in which G and D are
considered to be fixed $(m+1 \times m+1)$ matrices. In essence we
have gone from a non-linear transformation in G and D into a linear
transformation in the differential elements $[\delta G]$ and
$[\delta D]$.

Pre and post multiplying (2.3.10) by $G^{-1}$ and $(G')^{-1}$,
respectively, gives

$$G^{-1}[\delta W](G')^{-1} = G^{-1}[\delta G]D + [\delta D] + D[\delta G]'(G')^{-1}. \tag{2.3.12}$$

Let

$$A = G^{-1}[\delta W](G')^{-1},$$
$$B = G^{-1}[\delta G], \tag{2.3.13}$$
$$C = [\delta D].$$

We note that A is symmetric, B is lower triangular with
$(B)_{ii} = 0 : 0 \le i \le m$, and C is diagonal. We may rewrite
(2.3.12) as

$$A = BD + C + DB'. \tag{2.3.14}$$

From equations (2.3.10), (2.3.12), (2.3.13), and (2.3.14) we have

$$J\{W \to G, D\} = J\{[\delta W] \to [\delta G], [\delta D]\} = J\{[\delta W] \to A\} \cdot J\{A \to B, C\} \cdot J\{B, C \to [\delta G], [\delta D]\}. \tag{2.3.15}$$
We shall evaluate the last three Jacobians separately.

The Jacobian, $J\{[\delta W] \to A\}$, is the Jacobian of the first
transformation defined by (2.3.13). This can be evaluated by usual
methods and we find

$$J\{[\delta W] \to A\} = |G|^{(m+2)} = 1, \tag{2.3.16}$$

since G is lower triangular with unit diagonals.

The Jacobian, $J\{B, C \to [\delta G], [\delta D]\}$, is the
Jacobian of the transformation defined by the last two equations in
(2.3.13). Hence it may be factored into the product of two
Jacobians, namely,

$$J\{B, C \to [\delta G], [\delta D]\} = J\{B \to [\delta G]\} \cdot J\{C \to [\delta D]\}. \tag{2.3.17}$$

By the usual methods for determining Jacobians we find

$$J\{B \to [\delta G]\} = |G^{-1}|^{(m+1)} = 1 \tag{2.3.18}$$

and

$$J\{C \to [\delta D]\} = |I|^{(m+1)} = 1, \tag{2.3.19}$$

so that equation (2.3.17) is unity.

Finally we need to determine $J\{A \to B, C\}$. Writing out the
equations given by (2.3.14) and using the fact that B is lower
triangular with zero diagonal elements we find

$$a_{ij} = a_{ji} = b_{ij}\, d_j : 0 \le j < i \le m$$

and

$$a_{jj} = c_{jj} : 0 \le j \le m. \tag{2.3.20}$$

Hence we find that

$$\frac{\partial a_{jj}}{\partial c_{jj}} = 1 : 0 \le j \le m$$

and

$$\frac{\partial a_{ij}}{\partial b_{ij}} = d_j : 0 \le j < i \le m, \tag{2.3.21}$$

so that the Jacobian is

$$J\{A \to B, C\} = \frac{\partial(a_{00}, a_{10}, a_{20}, \ldots, a_{m0}, a_{11}, a_{21}, \ldots, a_{m1}, \ldots, a_{mm})}{\partial(c_{00}, b_{10}, b_{20}, \ldots, b_{m0}, c_{11}, b_{21}, \ldots, b_{m1}, \ldots, c_{mm})} = \prod_{j=0}^{m} d_j^{\,m-j}. \tag{2.3.22}$$

Following equation (2.3.15) we obtain

$$J\{W \to G, D\} = \prod_{j=0}^{m} d_j^{\,m-j}. \tag{2.3.23}$$
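The Jacobian (2.3.23) can be checked by finite differences. The sketch below is an illustration under stated assumptions, not part of the derivation: it assumes NumPy, uses hypothetical helper names (`w_from_params`, `numerical_jacobian`) and arbitrary values for D and G with m = 3, maps the free elements of (D, G) to the lower triangle of W = GDG', and compares the absolute determinant of the numerically differentiated map with the product of the $d_j^{m-j}$. The ordering of the variables affects only the sign of the determinant, so absolute values are compared.

```python
import numpy as np

def w_from_params(theta, p):
    """Map theta = (d_0..d_{p-1}, strictly lower elements of G) to vech(W)."""
    d = theta[:p]
    G = np.eye(p)
    rows, cols = np.tril_indices(p, k=-1)
    G[rows, cols] = theta[p:]
    W = G @ np.diag(d) @ G.T
    return W[np.tril_indices(p)]      # the distinct elements of symmetric W

def numerical_jacobian(f, theta, h=1e-6):
    """Central-difference Jacobian matrix of f at theta."""
    n = theta.size
    J = np.empty((n, n))
    for k in range(n):
        step = np.zeros(n)
        step[k] = h
        J[:, k] = (f(theta + step) - f(theta - step)) / (2.0 * h)
    return J

# Arbitrary illustrative values (assumed), with p = m + 1 = 4.
p = 4
m = p - 1
d = np.array([2.0, 0.5, 1.5, 3.0])
g = np.array([0.3, -1.2, 0.7, 0.4, -0.6, 1.1])
theta = np.concatenate([d, g])

J = numerical_jacobian(lambda th: w_from_params(th, p), theta)
numeric = abs(np.linalg.det(J))
closed_form = np.prod(d ** (m - np.arange(p)))   # prod_j d_j^(m-j)

print(numeric, closed_form)
```

Because W is a polynomial in the elements of G and D, central differences are accurate here to well within the tolerance needed for the comparison.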

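2.4  The Joint Distribution of G and D for Arbitrary V and in
Particular When $V \in \mathcal{V}$

In section 3 we derived the Jacobian of the non-linear
transformation $W = GDG'$. We now suppose that, for arbitrary
positive definite V,

$$W \sim W_{m+1}(V, \nu, (0)). \tag{2.4.1}$$

The joint density on G and D is then obtained from (2.4.1),
(2.3.6), and (2.3.23):

$$h(G, D) = K_{m+1}(V, \nu)\, \mathrm{etr}\{-\tfrac12 V^{-1} GDG'\} \prod_{j=0}^{m} d_j^{\frac12(\nu+m)-j-1}, \tag{2.4.2}$$

$$d_j > 0 : 0 \le j \le m$$

and

$$-\infty < g_{ij} < \infty : 0 \le j < i \le m.$$

The term $K_{m+1}(V, \nu)$ is defined by

$$K_{m+1}^{-1}(V, \nu) = |V|^{\frac12\nu}\, 2^{\frac12\nu(m+1)}\, \pi^{\frac14 m(m+1)} \prod_{j=0}^{m} \Gamma(\tfrac12(\nu - j)). \tag{2.4.3}$$

With $\Lambda = V^{-1}$ we may write

$$\mathrm{tr}\{-\tfrac12 V^{-1} GDG'\} = \mathrm{tr}\{-\tfrac12 \Lambda GDG'\} = \mathrm{tr}\{-\tfrac12 DG'\Lambda G\} = -\tfrac12 \sum_{j=0}^{m} d_j\, (G'\Lambda G)_{jj}. \tag{2.4.4}$$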

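Hence we see that the variables in the density (2.4.2) partition
into the subsets $\{d_0, g_{10}, g_{20}, \ldots, g_{m0}\}$;
$\{d_1, g_{21}, g_{31}, \ldots, g_{m1}\}$; ...;
$\{d_{m-1}, g_{m,m-1}\}$; $\{d_m\}$ which are mutually independent,
but variables within a subset are dependent.

In the case that $V \in \mathcal{V}$ we may write
$\sigma^2 \Lambda = R'R$ from equation (2.2.12) and we find

$$(G'\Lambda G)_{jj} = \frac{1}{\sigma^2} (G'R'RG)_{jj} = \frac{1}{\sigma^2} (RG)_{\cdot j}'\, (RG)_{\cdot j}, \tag{2.4.5}$$

where

$$(RG)_{\cdot 0}' = (\beta^{-1};\ g_{10} - \alpha_1;\ g_{20} - \alpha_2 g_{10};\ \ldots;\ g_{m0} - \alpha_m g_{m-1,0})$$

and for $1 \le j \le m-1$

$$(RG)_{\cdot j}' = (0;\ 0;\ \ldots;\ 1;\ g_{j+1,j} - \alpha_{j+1};\ g_{j+2,j} - \alpha_{j+2}\, g_{j+1,j};\ \ldots;\ g_{mj} - \alpha_m g_{m-1,j}). \tag{2.4.6}$$

The "1" appears as the j-th element (counting from zero) in
$(RG)_{\cdot j}$. Now, with $g_{jj} = 1$, we may write

$$(G'\Lambda G)_{00} = \frac{1}{\sigma^2} \Big\{ \beta^{-2} + \sum_{k=1}^{m} (g_{k0} - \alpha_k g_{k-1,0})^2 \Big\}$$

and for $1 \le j \le m-1$

$$(G'\Lambda G)_{jj} = \frac{1}{\sigma^2} \Big\{ 1 + \sum_{k=j+1}^{m} (g_{kj} - \alpha_k g_{k-1,j})^2 \Big\}. \tag{2.4.7}$$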

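The density $h(G, D)$ factors into

$$h(G, D) = \prod_{j=0}^{m-1} \{ h_j(d_j, g_{j+1,j}, g_{j+2,j}, \ldots, g_{mj}) \}\, h_m(d_m), \tag{2.4.8}$$

where

$$h_0(d_0, g_{10}, g_{20}, \ldots, g_{m0}) = \frac{d_0^{\frac12(\nu+m)-1} \exp\{-\tfrac12 d_0 \sigma^{-2} [\beta^{-2} + \sum_{k=1}^{m} (g_{k0} - \alpha_k g_{k-1,0})^2]\}}{\Gamma(\tfrac12\nu)\, 2^{\frac12(\nu+m)}\, \pi^{\frac12 m}\, |V|^{\frac12}\, |V_{(0)}|^{\frac12(\nu-1)}} \tag{2.4.9}$$

and for $1 \le j \le m-1$

$$h_j(d_j, g_{j+1,j}, \ldots, g_{mj}) = \frac{|V_{(j-1)}|^{\frac12(\nu-j)}\, d_j^{\frac12(\nu+m)-j-1} \exp\{-\tfrac12 d_j \sigma^{-2} [1 + \sum_{k=j+1}^{m} (g_{kj} - \alpha_k g_{k-1,j})^2]\}}{\Gamma(\tfrac12(\nu-j))\, 2^{\frac12(\nu+m)-j}\, \pi^{\frac12(m-j)}\, |V|^{\frac12}\, |V_{(j)}|^{\frac12(\nu-j-1)}} \tag{2.4.10}$$

and

$$h_m(d_m) = \{\Gamma(\tfrac12(\nu-m))\, (2\sigma^2)^{\frac12(\nu-m)}\}^{-1}\, d_m^{\frac12(\nu-m)-1} \exp[-\tfrac12 \sigma^{-2} d_m]. \tag{2.4.11}$$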

2.5  The Distribution of G, of D, and of G Conditional on D When V
Is Arbitrary and When $V \in \mathcal{V}$

The necessity of knowing the distribution of
$d_0; d_1; \ldots; d_m$ and the subsets
$\{g_{10}; g_{20}; \ldots; g_{m0}\}$;
$\{g_{21}; g_{31}; \ldots; g_{m1}\}$; ...; $\{g_{m,m-1}\}$ arises
from the fact that functions of these statistics will be estimators
for the parameters $\sigma^2$, $\beta^2$, and
$\{\alpha_1, \alpha_2, \ldots, \alpha_m\}$. A knowledge of the
distribution of the estimators gives us the information we need to
talk about the "goodness" of the estimators. It is to this end that
we derive the distribution of G, D, and G conditional on D.

The distribution on the elements of D follows directly from a
theorem given in class lecture notes and in Saw [15].

Theorem 2.5.1

If $W \sim W_{m+1}(V, \nu, (0))$, $\nu$ integer with $\nu > m$, and
$W_{(r)}$ is defined by

$$W_{(r)} = \begin{pmatrix} w_{00} & w_{01} & \cdots & w_{0r} \\ w_{10} & w_{11} & \cdots & w_{1r} \\ \vdots & \vdots & & \vdots \\ w_{r0} & w_{r1} & \cdots & w_{rr} \end{pmatrix} : 0 \le r \le m,$$

then $\{ |W_{(0)}| ;\ \{ |W_{(r)}| / |W_{(r-1)}| \}_{r=1}^{m} \}$
are independent chi-square variates such that

$$|W_{(0)}| \sim |V_{(0)}|\, \chi^2_\nu(0) \tag{2.5.2}$$

and for $1 \le r \le m$

$$|W_{(r)}| / |W_{(r-1)}| \sim |V_{(r)}| / |V_{(r-1)}|\ \chi^2_{\nu-r}(0). \tag{2.5.3}$$
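Since from (2.3.4) we have

$$d_0 = |W_{(0)}|, \qquad d_j = |W_{(j)}| / |W_{(j-1)}| : 1 \le j \le m,$$

then by direct application of Theorem 2.5.1 we have

$$d_0 \sim |V_{(0)}|\, \chi^2_\nu(0)$$

and for $1 \le j \le m$

$$d_j \sim |V_{(j)}| / |V_{(j-1)}|\ \chi^2_{\nu-j}(0). \tag{2.5.4}$$

Now if we allow $V \in \mathcal{V}$, then since
$|V_{(j)}| = \sigma^{2(j+1)} \beta^2$ we find

$$d_0 \sim \sigma^2 \beta^2 \chi^2_\nu(0)$$

and for $1 \le j \le m$

$$d_j \sim \sigma^2 \chi^2_{\nu-j}(0). \tag{2.5.5}$$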
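A small Monte Carlo experiment gives a rough check of (2.5.5). The sketch below is illustrative only: it assumes NumPy, arbitrary parameter values, and hypothetical helper names (`simulate_noise`, `doolittle_d`). It draws replicates from the first order generalized autoregressive model, forms W about the sample mean so that $\nu = n - 1$, extracts the diagonal of D from a Cholesky factor (used here in place of the Doolittle procedure), and compares the average of each $d_j$ with the mean of the stated chi-square law, namely $\sigma^2\beta^2\nu$ for $d_0$ and $\sigma^2(\nu - j)$ for $d_j$.

```python
import numpy as np

def simulate_noise(alpha, sigma2, beta2, n, rng):
    """Draw n replicates of (x_0, ..., x_m) with x_j = alpha_j x_{j-1} + eps_j."""
    m = len(alpha)
    x = np.empty((n, m + 1))
    x[:, 0] = rng.normal(0.0, np.sqrt(sigma2 * beta2), size=n)   # Var(x_0) = sigma^2 beta^2
    for j in range(1, m + 1):
        eps = rng.normal(0.0, np.sqrt(sigma2), size=n)           # Var(eps_j) = sigma^2
        x[:, j] = alpha[j - 1] * x[:, j - 1] + eps
    return x

def doolittle_d(W):
    """Diagonal of D in W = G D G', via the Cholesky factor of W."""
    return np.diag(np.linalg.cholesky(W)) ** 2

# Illustrative settings (assumed for this sketch).
rng = np.random.default_rng(0)
alpha = np.array([0.8, -0.5, 1.2])
sigma2, beta2 = 2.0, 0.5
n, reps = 20, 4000
nu = n - 1
m = len(alpha)

d_samples = np.empty((reps, m + 1))
for r in range(reps):
    x = simulate_noise(alpha, sigma2, beta2, n, rng)
    xc = x - x.mean(axis=0)            # centre at the sample mean, as in (2.3.3)
    d_samples[r] = doolittle_d(xc.T @ xc)

mean_d = d_samples.mean(axis=0)
expected = sigma2 * np.array([beta2 * nu] + [nu - j for j in range(1, m + 1)])
print(mean_d)
print(expected)
```

Agreement of the means is of course a weaker statement than agreement of the distributions; a fuller check would compare quantiles.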

To find the density on the subsets
$\{g_{10}; g_{20}; \ldots; g_{m0}\}$;
$\{g_{21}; g_{31}; \ldots; g_{m1}\}$; ...; $\{g_{m,m-1}\}$ we refer
back to equations (2.4.9) and (2.4.10). Using those equations we
may write for $0 \le j \le m-1$,

$$h_j(d_j, g_{j+1,j}; \ldots; g_{m,j}) = C\, d_j^{\frac12(\nu+m)-j-1} \exp\{-\tfrac12 d_j\, (G'\Lambda G)_{jj}\}, \tag{2.5.6}$$

for some constant C. Performing the integration over $d_j$ and
replacing $(G'\Lambda G)_{jj}$ by
$\sum_{k=j}^{m} \sum_{\ell=j}^{m} \lambda_{k\ell}\, g_{kj}\, g_{\ell j}$
we have

$$h_j(g_{j+1,j}; \ldots; g_{mj}) = C_m(\nu : V, j) \Big( \sum_{k=j}^{m} \sum_{\ell=j}^{m} \lambda_{k\ell}\, g_{kj}\, g_{\ell j} \Big)^{-(\frac12(\nu+m)-j)}, \tag{2.5.7}$$

where, with $|V_{(-1)}|$ taken as unity,

$$C_m(\nu : V, j) = \frac{\Gamma(\tfrac12(\nu+m) - j)\, |V_{(j-1)}|^{\frac12(\nu-j)}}{\Gamma(\tfrac12(\nu-j))\, \pi^{\frac12(m-j)}\, |V|^{\frac12}\, |V_{(j)}|^{\frac12(\nu-j-1)}}. \tag{2.5.8}$$

Remembering that $g_{jj} = 1$, a transformation shows the variables
in (2.5.7) have a multivariate t-distribution. The form and
properties of the multivariate t-distribution were found by Cornish
[6] in 1954 and also in the same year by Dunnett and Sobel [8].

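Now if we allow $V \in \mathcal{V}$, we find

$$h_0(g_{10}; g_{20}; \ldots; g_{m0}) = \beta^m\, C_m(\nu : I, 0) \Big\{ 1 + \beta^2 \sum_{k=1}^{m} (g_{k0} - \alpha_k g_{k-1,0})^2 \Big\}^{-\frac12(\nu+m)} \tag{2.5.9}$$

and for $1 \le j \le m-1$

$$h_j(g_{j+1,j}; \ldots; g_{mj}) = C_m(\nu : I, j) \Big\{ 1 + \sum_{k=j+1}^{m} (g_{kj} - \alpha_k g_{k-1,j})^2 \Big\}^{-(\frac12(\nu+m)-j)}. \tag{2.5.10}$$

Evidently $g_{10}, g_{21}, \ldots, g_{m,m-1}$ are mutually
independent and

$$\sqrt{\nu}\, \beta\, (g_{10} - \alpha_1) \sim t_\nu(0) \tag{2.5.11}$$

and

$$(\nu - k + 1)^{\frac12} (g_{k,k-1} - \alpha_k) \sim t_{\nu-k+1}(0) : 2 \le k \le m. \tag{2.5.12}$$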
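The t-distribution results (2.5.11) and (2.5.12) say that the sub-diagonal elements $g_{k,k-1}$ of G are centred at $\alpha_k$. A hedged simulation sketch of this follows, assuming NumPy, arbitrary parameter values, and hypothetical helper names (`simulate_noise`, `doolittle_g`); the Cholesky factor again stands in for the Doolittle procedure. It compares the Monte Carlo mean of each $g_{k,k-1}$ with $\alpha_k$, and the variance of each standardized statistic with $n/(n-2)$, the variance of a central t variate on n degrees of freedom.

```python
import numpy as np

def simulate_noise(alpha, sigma2, beta2, n, rng):
    """Draw n replicates of (x_0, ..., x_m) with x_j = alpha_j x_{j-1} + eps_j."""
    m = len(alpha)
    x = np.empty((n, m + 1))
    x[:, 0] = rng.normal(0.0, np.sqrt(sigma2 * beta2), size=n)
    for j in range(1, m + 1):
        x[:, j] = alpha[j - 1] * x[:, j - 1] + rng.normal(0.0, np.sqrt(sigma2), size=n)
    return x

def doolittle_g(W):
    """Unit lower triangular G in W = G D G', via the Cholesky factor of W."""
    L = np.linalg.cholesky(W)
    return L / np.diag(L)              # divide column j by L[j, j]

# Illustrative settings (assumed for this sketch).
rng = np.random.default_rng(1)
alpha = np.array([0.8, -0.5, 1.2])
sigma2, beta2 = 2.0, 0.5
n, reps = 20, 4000
nu = n - 1
m = len(alpha)

sub = np.empty((reps, m))
for r in range(reps):
    x = simulate_noise(alpha, sigma2, beta2, n, rng)
    xc = x - x.mean(axis=0)
    G = doolittle_g(xc.T @ xc)
    sub[r] = np.diag(G, k=-1)          # g_10, g_21, ..., g_{m,m-1}

# Scale factors: sqrt(nu) * beta for k = 1, sqrt(nu - k + 1) for k >= 2.
scale = np.sqrt(np.array([nu * beta2] + [nu - k + 1 for k in range(2, m + 1)]))
t_stats = scale * (sub - alpha)

df = np.array([nu - k + 1 for k in range(1, m + 1)])
print(sub.mean(axis=0), alpha)
print(t_stats.var(axis=0), df / (df - 2))
```

Reading $g_{k,k-1}$ as a point estimate of $\alpha_k$ is an interpretation suggested by these results; the estimators themselves are not defined in this section.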

To find the distribution of G conditional on D let
$|V_{(-1)}| = 1$, then the marginal distribution on D is

$$h(D) = h(d_0) \cdot h(d_1) \cdots h(d_m) = \prod_{j=0}^{m} \frac{d_j^{\frac12(\nu-j)-1} \exp\{-\tfrac12 |V_{(j)}|^{-1} |V_{(j-1)}|\, d_j\}}{(2 |V_{(j)}|\, |V_{(j-1)}|^{-1})^{\frac12(\nu-j)}\, \Gamma(\tfrac12(\nu-j))}.$$

We have from equations (2.4.2) and (2.4.4) that the joint density
on (G, D) is

$$h(G, D) = K_{m+1}(V, \nu) \prod_{j=0}^{m} \{ d_j^{\frac12(\nu+m)-j-1} \exp(-\tfrac12 d_j\, (G'\Lambda G)_{jj}) \}.$$

The conditional distribution of G given D is the quotient
$h(G \mid D) = h(G, D) / h(D)$. Although this does not have a very
pleasing form in general, if we let $V \in \mathcal{V}$ we find
that the conditional density simplifies greatly. If we write
$\sigma^2 G'\Lambda G = (RG)'(RG)$ and use the fact that
$|V_{(j)}| = \sigma^{2(j+1)} \beta^2$ we find

$$h(G \mid D) = \prod_{j=0}^{m-1} \Big\{ (2\pi\sigma^2 d_j^{-1})^{-\frac12(m-j)} \exp\Big[ -\tfrac12 \sigma^{-2} d_j \sum_{k=j+1}^{m} (g_{kj} - \alpha_k g_{k-1,j})^2 \Big] \Big\}. \tag{2.5.15}$$


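Let $\underline{g}_j$ be the (m-j)-variate column vector given by

$$\underline{g}_j = (g_{j+1,j};\ g_{j+2,j};\ \ldots;\ g_{mj})' : 0 \le j \le m-1, \tag{2.5.16}$$

$\underline{\mu}_j$ the (m-j) dimensional column vector defined by

$$\underline{\mu}_j = (\alpha_{j+1};\ \alpha_{j+1}\alpha_{j+2};\ \ldots;\ \alpha_{j+1}\alpha_{j+2} \cdots \alpha_m)' : 0 \le j \le m-1, \tag{2.5.17}$$

and $V_j$ the $(m-j \times m-j)$ matrix whose elements are given by

$$(V_j)_{11} = \frac{\sigma^2}{d_j} : 0 \le j \le m-1,$$

$$(V_j)_{kk} = \frac{\sigma^2}{d_j} + \alpha_{j+k}^2\, (V_j)_{k-1,k-1} : 2 \le k \le m-j;\ 0 \le j \le m-1, \tag{2.5.18}$$

and

$$(V_j)_{k\ell} = (V_j)_{\ell k} = \alpha_{j+k+1} \alpha_{j+k+2} \cdots \alpha_{j+\ell}\, (V_j)_{kk} : 1 \le k < \ell \le m-j;\ 0 \le j \le m-1.$$

Then equation (2.5.15) may be written as

$$h(G \mid D) = \prod_{j=0}^{m-1} h_j(\underline{g}_j \mid d_j), \tag{2.5.19}$$

where for $0 \le j \le m-1$

$$h_j(\underline{g}_j \mid d_j) = (2\pi)^{-\frac12(m-j)}\, |V_j|^{-\frac12} \exp[-\tfrac12 (\underline{g}_j - \underline{\mu}_j)'\, V_j^{-1} (\underline{g}_j - \underline{\mu}_j)]. \tag{2.5.20}$$

That is

$$\underline{g}_j \mid d_j \sim N_{m-j}(\underline{\mu}_j, V_j) : 0 \le j \le m-1. \tag{2.5.21}$$

2.6  Verification of the Distribution of G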


In finding the density of the subset
$(g_{k+1,k}; \ldots; g_{mk})$ the constant $C_m(\nu : V, k)$ was
given, but was not verified. In this section we show how the value
of the constant can be obtained.

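We need to evaluate the integral

\[ I_B = \int\!\cdots\!\int \Big(\sum_{i=0}^{p}\sum_{j=0}^{p} a_{ij}u_iu_j\Big)^{-B} du_1\,du_2\cdots du_p , \tag{2.6.1} \]

taken over $\{-\infty < u_i < +\infty : 1 \le i \le p\}$, where $u_0 = 1$ and $a_{ij} = a_{ji}$.

We may write

\[ \sum_{i=0}^{p}\sum_{j=0}^{p} a_{ij}u_iu_j
   = (1 \ \ u') \begin{pmatrix} a_{00} & a' \\ a & A \end{pmatrix}
     \begin{pmatrix} 1 \\ u \end{pmatrix} , \tag{2.6.2} \]

where

\[ a = (a_{01},\ a_{02},\ \ldots,\ a_{0p})' \tag{2.6.3} \]

and

\[ A = \begin{pmatrix}
 a_{11} & a_{12} & \cdots & a_{1p} \\
 a_{21} & a_{22} & \cdots & a_{2p} \\
 \vdots & \vdots &        & \vdots \\
 a_{p1} & a_{p2} & \cdots & a_{pp}
\end{pmatrix} . \tag{2.6.4} \]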


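Now equation (2.6.2) may be rewritten as

\[ \sum_{i=0}^{p}\sum_{j=0}^{p} a_{ij}u_iu_j = a_{00} + u'Au + 2u'a \tag{2.6.5} \]

\[ \qquad = a_{00} + (u + A^{-1}a)'A(u + A^{-1}a) - a'A^{-1}a . \tag{2.6.6} \]

Write

\[ u + A^{-1}a = (a_{00} - a'A^{-1}a)^{\frac12} K w , \tag{2.6.7} \]

where $K'AK = I$, so that $|K| = |A|^{-\frac12}$. The Jacobian from $u$
into $w$ can be found by standard methods and is

\[ J\{u \to w\} = (a_{00} - a'A^{-1}a)^{\frac12 p}\,|K| \tag{2.6.8} \]

\[ \qquad = (a_{00} - a'A^{-1}a)^{\frac12 p}\,|A|^{-\frac12} . \tag{2.6.9} \]

Hence we have

\[ I_B = (a_{00} - a'A^{-1}a)^{\frac12 p - B}\,|A|^{-\frac12}
   \int\!\cdots\!\int (1 + w'w)^{-B}\,dw_1\cdots dw_p , \tag{2.6.10} \]

taken over $\{-\infty < w_i < \infty : 1 \le i \le p\}$.

We compare this with the following version of the multivariate
t-distribution

\[ f(t) = C_p\,|\Sigma|^{-\frac12}\,[1 + (t-\theta)'\Sigma^{-1}(t-\theta)]^{-\frac12(n+p)}
   : -\infty < t_j < \infty,\ 1 \le j \le p,\ n > 0 , \tag{2.6.11} \]

where

\[ C_p = \Gamma(\tfrac12(n+p)) / \{\pi^{\frac12 p}\,\Gamma(\tfrac12 n)\} . \tag{2.6.12} \]

Let $\Sigma = I$ and $\theta = 0$; then setting $B = \tfrac12(n+p)$ we see that
$n = 2B - p$. Since $n > 0$ we must have $B > \tfrac12 p$, and we find

\[ \int\!\cdots\!\int (1 + w'w)^{-B}\,dw_1\cdots dw_p
   = \frac{\pi^{\frac12 p}\,\Gamma(B - \tfrac12 p)}{\Gamma(B)} , \tag{2.6.13} \]

taken over $\{-\infty < w_i < \infty : 1 \le i \le p\}$.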


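Hence,

\[ I_B = (a_{00} - a'A^{-1}a)^{\frac12 p - B}\,|A|^{-\frac12}\,
   \frac{\pi^{\frac12 p}\,\Gamma(B - \tfrac12 p)}{\Gamma(B)} \tag{2.6.14} \]

\[ \qquad = \begin{vmatrix} a_{00} & a' \\ a & A \end{vmatrix}^{\frac12 p - B}
   \frac{\Gamma(B - \tfrac12 p)\,\pi^{\frac12 p}}{|A|^{\frac12(p+1) - B}\,\Gamma(B)} . \tag{2.6.15} \]

Referring back to (2.5.8) we see that $p = m-j$ and $B = \tfrac12(\nu+m) - j$,
so that

\[ C_m(\nu:V,j) = \frac{\Gamma(\tfrac12(\nu+m) - j)\;|A_1|^{\frac12(\nu-j)}}
   {\Gamma(\tfrac12(\nu-j))\;\pi^{\frac12(m-j)}\;|A_2|^{\frac12(\nu-j-1)}} , \tag{2.6.16} \]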


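where

\[ A_1 = \begin{pmatrix}
 \lambda_{j,j}   & \lambda_{j,j+1}   & \cdots & \lambda_{j,m} \\
 \lambda_{j+1,j} & \lambda_{j+1,j+1} & \cdots & \lambda_{j+1,m} \\
 \vdots          & \vdots            &        & \vdots \\
 \lambda_{m,j}   & \lambda_{m,j+1}   & \cdots & \lambda_{m,m}
\end{pmatrix} \tag{2.6.17} \]

and

\[ A_2 = \begin{pmatrix}
 \lambda_{j+1,j+1} & \lambda_{j+1,j+2} & \cdots & \lambda_{j+1,m} \\
 \lambda_{j+2,j+1} & \lambda_{j+2,j+2} & \cdots & \lambda_{j+2,m} \\
 \vdots            & \vdots            &        & \vdots \\
 \lambda_{m,j+1}   & \lambda_{m,j+2}   & \cdots & \lambda_{m,m}
\end{pmatrix} . \tag{2.6.18} \]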


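Now with $A = V^{-1}$, then by an application of the clockwise rule

\[ |V|^{-1} = |A| , \tag{2.6.19} \]

hence we find

\[ \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix}
   = |A_{22}|\,|A_{11} - A_{12}A_{22}^{-1}A_{21}|
   = |A_{22}|\,|V_{11}|^{-1} . \tag{2.6.20} \]

So that

\[ |A_{22}| = |V|^{-1}\,|V_{11}| , \]

\[ |A_1| = |V|^{-1}\,|V_{(j-1)}| , \tag{2.6.21} \]

\[ |A_2| = |V|^{-1}\,|V_{(j)}| , \tag{2.6.22} \]

and we confirm that

\[ C_m(\nu:V,j) = \frac{\Gamma(\tfrac12(\nu+m) - j)\;|V_{(j-1)}|^{\frac12(\nu-j)}}
   {\Gamma(\tfrac12(\nu-j))\;\pi^{\frac12(m-j)}\;|V|^{\frac12}\;|V_{(j)}|^{\frac12(\nu-j-1)}} . \]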

Chapter  III 
THE ESTIMATORS $\sigma_*^2$, $\beta_*^2$, and $\{\alpha_{*j}: 1 \le j \le m\}$

3.1 Introduction

In Chapter II the unknown parameters $\sigma^2$, $\beta^2$, and
$\{\alpha_j: 1 \le j \le m\}$ were introduced to define the generalized first
order autoregressive process and thence the underlying
distribution of the observations. In this chapter we
propose estimators (alternatives to the maximum likelihood
estimators) for these parameters and discuss their properties.
(To distinguish between the estimators proposed in this chapter
and the maximum likelihood estimators, we shall reserve the
hat (^) notation for the latter and star (*) notation for
the former.)

Finally, tests of hypothesis concerning the parameters
are discussed. Special attention is given to the case of
testing $\alpha_k = \alpha_{k+1} = \ldots = \alpha_m = \eta_0$, for $k = 2,3,\ldots,m$. A
method, due to Fisher, of combining independent tests is
likely to be appropriate.

3.2 The Distribution and Properties of $\sigma_*^2$, $\beta_*^2$, and $\{\alpha_{*j}: 1 \le j \le m\}$

Suppose that we observe a number of (m+1)-variate
column vectors $y_j : 1 \le j \le n$ $(n > m+1)$, obtained by random sampling
from an (m+1)-variate normal population with mean $\mu$ and
dispersion matrix $V$. We assume that these observations come from
the process $Y(t) = \mu(t) + X(t)$, during times $t_0 < t_1 < \ldots < t_m$,
where $X(t)$ follows the first order generalized autoregressive
process. Since $\mathcal{E}\,Y(t) = \mu(t)$ we have, letting $\mu(t_i) = \mu_i : 0 \le i \le m$,
that

\[ \hat{\mu}_i = \bar{y}_{i\cdot} = \frac{1}{n}\sum_{j=1}^{n} y_{ij} : 0 \le i \le m , \tag{3.2.1} \]

or alternatively, with the (m+1) dimensional column vector
$\mu = (\mu_0, \mu_1, \ldots, \mu_m)'$, that

\[ \hat{\mu} = \bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j . \tag{3.2.2} \]

Estimates of the noise process $X(t)$ for $t_0 < t_1 < \ldots < t_m$ are

\[ x_j = y_j - \bar{y} : 1 \le j \le n . \tag{3.2.3} \]

From these we may arrive at

\[ W = \sum_{j=1}^{n} x_j x_j' \tag{3.2.4} \]

\[ \quad\; = \sum_{j=1}^{n} (y_j - \bar{y})(y_j - \bar{y})' , \tag{3.2.5} \]

where $W$ has the central Wishart distribution on $\nu = n-1$
degrees of freedom and dispersion matrix $V$ (where $V \in \mathcal{V}$).
Hence the theory of Chapter II is applicable and we have
from equations (2.5.5), (2.5.11), and (2.5.12) that

\[ d_0 \sim \sigma^2\beta^2\chi^2_{\nu}(0) , \tag{3.2.6} \]

\[ d_j \sim \sigma^2\chi^2_{\nu-j}(0) : 1 \le j \le m , \tag{3.2.7} \]

\[ \nu^{\frac12}\beta\,(g_{10} - \alpha_1) \sim t_{\nu}(0) , \tag{3.2.8} \]

and

\[ (\nu-k+1)^{\frac12}(g_{k,k-1} - \alpha_k) \sim t_{\nu-k+1}(0) : 2 \le k \le m . \tag{3.2.9} \]

Hence the estimators for $\{\alpha_j : 1 \le j \le m\}$ are:

\[ \alpha_{*j} = g_{j,j-1} : 1 \le j \le m , \tag{3.2.10} \]

with densities given in equations (3.2.8) and (3.2.9).

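The estimators for $\sigma^2$ and $\beta^2$ are:

\[ \sigma_*^2 = \frac{1}{mc_1}\sum_{j=1}^{m} d_j , \tag{3.2.11} \]

where

\[ c_1 = \nu - \tfrac12(m+1) , \tag{3.2.12} \]

and

\[ \beta_*^2 = d_0\Big(c_2\sum_{j=1}^{m} d_j\Big)^{-1} , \tag{3.2.13} \]

where

\[ c_2 = \nu\,(mc_1 - 2)^{-1} . \tag{3.2.14} \]

We note that $\sigma_*^2$ has a Gamma density and $\beta_*^2$ has a Beta Type
2 density.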
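
The estimators (3.2.10), (3.2.11), and (3.2.13) can be illustrated numerically. The following is a minimal sketch, not a procedure given in the text: it assumes $W$ is supplied as a nested list, that $W = GDG'$ is the factorization of Chapter II with $G$ unit lower triangular and $D$ diagonal, and the function names `ldl_decompose` and `starred_estimators` are hypothetical.

```python
def ldl_decompose(w):
    """Factor a symmetric positive definite matrix as W = G D G'.

    G is unit lower triangular and D is diagonal (returned as a list).
    Assumption: this is the (G, D) decomposition referred to in Chapter II.
    """
    p = len(w)
    g = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    d = [0.0] * p
    for j in range(p):
        d[j] = w[j][j] - sum(g[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, p):
            cross = sum(g[i][k] * g[j][k] * d[k] for k in range(j))
            g[i][j] = (w[i][j] - cross) / d[j]
    return g, d


def starred_estimators(w, nu):
    """Return (alpha_star, sigma2_star, beta2_star) from W and nu = n - 1."""
    m = len(w) - 1
    g, d = ldl_decompose(w)
    # (3.2.10): alpha_*j = g_{j,j-1}, j = 1..m
    alpha_star = [g[j][j - 1] for j in range(1, m + 1)]
    c1 = nu - 0.5 * (m + 1)          # (3.2.12)
    c2 = nu / (m * c1 - 2.0)         # (3.2.14)
    s = sum(d[1:])                   # d_1 + ... + d_m
    sigma2_star = s / (m * c1)       # (3.2.11)
    beta2_star = d[0] / (c2 * s)     # (3.2.13)
    return alpha_star, sigma2_star, beta2_star
```

For example, with $m = 2$, $d = (4, 2, 3)$, $g_{10} = 0.5$, $g_{20} = 0.25$, and $g_{21} = -0.5$, the matrix $W = GDG'$ has rows $(4, 2, 1)$, $(2, 3, -0.5)$, $(1, -0.5, 3.75)$, and the sketch recovers $\alpha_{*1} = 0.5$ and $\alpha_{*2} = -0.5$.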

In  order  to  evaluate  the  "goodness"  of  these  estimators 
the  following  properties  are  investigated:   (1)  the  first  two 
moments,  (2)  the  consistency  of  the  estimators,  and  (3)  the 
efficiency  of  the  estimators. 

The first moments of the estimators are given by

\[ \mathcal{E}(\alpha_{*j}) = \alpha_j : 1 \le j \le m , \tag{3.2.15} \]

\[ \mathcal{E}(\sigma_*^2) = \sigma^2 , \tag{3.2.16} \]

and

\[ \mathcal{E}(\beta_*^2) = \beta^2 , \tag{3.2.17} \]

so that the estimators are all unbiassed.

Letting $\Sigma_*$ denote the $(m+2 \times m+2)$ variance-covariance
matrix of the (m+2) dimensional vector of unbiassed estimators
$(\alpha_{*1}, \alpha_{*2}, \ldots, \alpha_{*m}, \sigma_*^2, \beta_*^2)$, we find

\[ (\Sigma_*)_{11} = \beta^{-2}(\nu-2)^{-1} , \tag{3.2.18} \]

\[ (\Sigma_*)_{jj} = (\nu-j-1)^{-1} : 2 \le j \le m , \tag{3.2.19} \]

\[ (\Sigma_*)_{m+1,m+1} = 2(mc_1)^{-1}\sigma^4 , \tag{3.2.20} \]

\[ (\Sigma_*)_{m+2,m+2} = 2(\nu + mc_1 - 2)[\nu(mc_1 - 4)]^{-1}\beta^4 , \tag{3.2.21} \]

\[ (\Sigma_*)_{m+1,m+2} = (\Sigma_*)_{m+2,m+1} = -2(mc_1)^{-1}\sigma^2\beta^2 , \tag{3.2.22} \]

and

\[ (\Sigma_*)_{ij} = 0 : \text{elsewhere} . \tag{3.2.23} \]

In Fisz [10] and Feller [9] it is shown that a statistic
$t_\nu$, based on $\nu$ observations, with mean $\theta$ and variance $\tau_\nu$, will be
consistent for $\theta$ if

\[ \lim_{\nu\to\infty} \tau_\nu = 0 . \tag{3.2.24} \]

In the equations (3.2.18) thru (3.2.21), if we let $\nu \to \infty$,
we find the variance of the estimator goes to zero. Hence
the estimators $(\alpha_{*1}, \alpha_{*2}, \ldots, \alpha_{*m}, \sigma_*^2, \beta_*^2)$ are consistent.

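The likelihood function, from which the estimators
are derived, is the density of $W$, given by

\[ L = f(W) = K_{m+1}(V,\nu)\,|W|^{\frac12(\nu-m-2)}\,\mathrm{etr}\{-\tfrac12 V^{-1}W\} . \tag{3.2.25} \]

Using this we may obtain the "Information Matrix", $F$,
whose elements are defined by

\[ (F)_{j\ell} = \mathcal{E}\Big[-\frac{\partial^2 \log L}{\partial\theta_j\,\partial\theta_\ell}\Big] , \tag{3.2.26} \]

where $\theta_j$ and $\theta_\ell$ are any two of the parameters $\alpha_1, \ldots, \alpha_m$,
$\sigma^2$, and $\beta^2$. The $j$th diagonal element of $F^{-1}$ gives the
minimum variance bound of any estimator of $\theta_j$. Letting
$(\theta_1, \theta_2, \ldots, \theta_m, \theta_{m+1}, \theta_{m+2}) \equiv (\alpha_1, \alpha_2, \ldots, \alpha_m, \sigma^2, \beta^2)$ we find

\[ (F)_{jj} = \mathcal{E}\{\sigma^{-2}w_{j-1,j-1}\} = \frac{\nu}{\sigma^2}\,v_{j-1,j-1} : 1 \le j \le m , \tag{3.2.27} \]

with

\[ v_{j-1,j-1} = \sigma^2\big(1 + \alpha_{j-1}^2 + \alpha_{j-1}^2\alpha_{j-2}^2 + \ldots
   + \alpha_{j-1}^2\alpha_{j-2}^2\cdots\alpha_2^2
   + \alpha_{j-1}^2\alpha_{j-2}^2\cdots\alpha_2^2\alpha_1^2\beta^2\big) , \tag{3.2.28} \]

\[ (F)_{m+1,m+1} = \mathcal{E}\Big[-\frac{\nu(m+1)}{2\sigma^4} + \frac{1}{\sigma^4}\,\mathrm{tr}\,V^{-1}W\Big]
   = \frac{\nu(m+1)}{2\sigma^4} , \tag{3.2.29} \]

\[ (F)_{m+2,m+2} = \mathcal{E}\Big[-\frac{\nu}{2\beta^4} + \frac{w_{00}}{\sigma^2\beta^6}\Big]
   = \frac{\nu}{2\beta^4} , \tag{3.2.30} \]

\[ (F)_{m+1,m+2} = (F)_{m+2,m+1} = \mathcal{E}\Big[\frac{w_{00}}{2\sigma^4\beta^4}\Big]
   = \frac{\nu}{2\sigma^2\beta^2} , \tag{3.2.31} \]

and

\[ (F)_{j\ell} = 0 : \text{elsewhere} . \tag{3.2.32} \]

To see that $(F)_{j\ell} = 0$ elsewhere we note that

\[ \frac{\partial^2 \log L}{\partial\theta_j\,\partial\theta_\ell} = 0
   \quad\text{for } 1 \le j \le m;\ 1 \le \ell \le m;\ j \ne \ell , \tag{3.2.33} \]

\[ \frac{\partial^2 \log L}{\partial\theta_j\,\partial\theta_{m+1}}
   = \frac{1}{\sigma^4}\{\alpha_j w_{j-1,j-1} - w_{j-1,j}\} : 1 \le j \le m , \tag{3.2.34} \]

and

\[ \frac{\partial^2 \log L}{\partial\theta_j\,\partial\theta_{m+2}} = 0 : 1 \le j \le m . \tag{3.2.35} \]

Now

\[ \mathcal{E}\Big(\frac{1}{\sigma^4}[\alpha_j w_{j-1,j-1} - w_{j-1,j}]\Big)
   = \frac{\nu}{\sigma^4}(\alpha_j v_{j-1,j-1} - v_{j-1,j}) = 0 , \tag{3.2.36} \]

since $v_{j-1,j} = \alpha_j v_{j-1,j-1}$. Hence $(F)_{j\ell}$ is zero, as was to be
shown.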


With $\Sigma = F^{-1}$ we find that the minimum variance bounds
are

\[ (\Sigma)_{jj} = (\nu\,v_{j-1,j-1})^{-1}\sigma^2 : 1 \le j \le m , \tag{3.2.37} \]

\[ (\Sigma)_{m+1,m+1} = 2(\nu m)^{-1}\sigma^4 , \tag{3.2.38} \]

\[ (\Sigma)_{m+2,m+2} = 2(m+1)(\nu m)^{-1}\beta^4 , \tag{3.2.39} \]

\[ (\Sigma)_{m+1,m+2} = (\Sigma)_{m+2,m+1} = -2(\nu m)^{-1}\sigma^2\beta^2 , \tag{3.2.40} \]

and

\[ (\Sigma)_{j\ell} = 0 : \text{elsewhere} . \tag{3.2.41} \]

Since  the  efficiency  of  an  estimator  is  the  ratio  of  the 
minimum  variance  bound  to  the  variance  of  the  estimator, 
we  have  from  equations  (3.2.18)  thru  (3.2.23)  and  equations 
(3.2.37) thru  (3.2.41)  that  the  efficiency  of  the  starred 
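estimator, denoted by $\mathrm{Eff}(\theta_*)$, is

\[ \mathrm{Eff}(\alpha_{*1}) = (\nu-2)\,\nu^{-1} , \tag{3.2.42} \]

\[ \mathrm{Eff}(\alpha_{*j}) = (\nu-j-1)(\nu\,v_{j-1,j-1})^{-1}\sigma^2 : 2 \le j \le m , \tag{3.2.43} \]

\[ \mathrm{Eff}(\sigma_*^2) = c_1\,\nu^{-1} , \tag{3.2.44} \]

and

\[ \mathrm{Eff}(\beta_*^2) = (m+1)\,\nu\,(mc_1 - 4)\,[\nu m(\nu + mc_1 - 2)]^{-1} . \tag{3.2.45} \]

The estimators $\alpha_{*1}$, $\sigma_*^2$, and $\beta_*^2$ are asymptotically efficient,
while the asymptotic efficiency of $\alpha_{*j} : 2 \le j \le m$ is given by

\[ \lim_{\nu\to\infty} \mathrm{Eff}(\alpha_{*j}) = v_{j-1,j-1}^{-1}\sigma^2 : 2 \le j \le m . \tag{3.2.46} \]

Replacing $v_{j-1,j-1}$ by the right hand side of equation
(3.2.28) gives

\[ \lim_{\nu\to\infty} \mathrm{Eff}(\alpha_{*j}) = \big(1 + \alpha_{j-1}^2 + \alpha_{j-1}^2\alpha_{j-2}^2 + \ldots
   + \alpha_{j-1}^2\alpha_{j-2}^2\cdots\alpha_2^2
   + \alpha_{j-1}^2\alpha_{j-2}^2\cdots\alpha_2^2\alpha_1^2\beta^2\big)^{-1} : 2 \le j \le m . \tag{3.2.47} \]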

Hence the asymptotic efficiency of $\alpha_{*j}$ depends on the true
values of the previous $\alpha$'s and $\beta^2$. If the true value of
$\alpha_{j-1}$ is zero then $\alpha_{*j}$ is asymptotically efficient regardless
of the values of any of the previous $\alpha$'s. That is, the
asymptotic efficiency of $\alpha_{*j}$ is dependent most upon the true
value of $\alpha_{j-1}$, the previous $\alpha$, next upon $\alpha_{j-2}$, and so forth,
with the least dependence on $\alpha_1$ and $\beta^2$. If we suppose that
all the parameters are less than unity in absolute value and
write

\[ a = \max(|\alpha_1|, |\alpha_2|, \ldots, |\alpha_{j-1}|, |\beta|) , \tag{3.2.48} \]

then we may arrive at the inequality

\[ (1 + a^2 + a^4 + \ldots + a^{2(j-2)} + a^{2j}) \ge
   (1 + \alpha_{j-1}^2 + \alpha_{j-1}^2\alpha_{j-2}^2 + \ldots
   + \alpha_{j-1}^2\alpha_{j-2}^2\cdots\alpha_2^2
   + \alpha_{j-1}^2\alpha_{j-2}^2\cdots\alpha_2^2\alpha_1^2\beta^2) , \tag{3.2.49} \]

where equality holds only when $|\alpha_1| = |\alpha_2| = \ldots =
|\alpha_{j-1}| = |\beta| = |a|$. The inequality reverses upon taking
reciprocals, and we find the asymptotic efficiency of $\alpha_{*j}$ is
at least as great as

\[ \min \mathrm{Eff}(\alpha_{*j}) = (1 - a^{2(j-1)} + a^{2j} - a^{2(j+1)})^{-1}(1 - a^2) : 2 \le j \le m . \tag{3.2.50} \]

Although $\{\alpha_{*j} : 2 \le j \le m\}$ is inefficient, this loss in efficiency
is more than made up for by the fact that the distribution
of $\{(\alpha_{*j} - \alpha_j) : 2 \le j \le m\}$ contains no unknown parameters.
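
As an illustration of (3.2.47) and (3.2.50), the limiting efficiency and its lower bound can be evaluated for chosen parameter values. This is a minimal sketch; the function names and the list convention `alphas[i - 1]` for $\alpha_i$ are assumptions introduced here, not part of the text.

```python
def asymptotic_efficiency(alphas, beta2, j):
    """Right-hand side of (3.2.47) for 2 <= j <= m.

    alphas[i - 1] holds alpha_i; beta2 is beta squared.
    """
    total = 1.0
    prod = 1.0
    for i in range(j - 1, 0, -1):        # i = j-1, j-2, ..., 1
        prod *= alphas[i - 1] ** 2       # alpha_{j-1}^2 ... alpha_i^2
        # the final term (i = 1) also carries the factor beta^2
        total += prod * beta2 if i == 1 else prod
    return 1.0 / total


def efficiency_lower_bound(a, j):
    """Right-hand side of (3.2.50), where a is the maximum in (3.2.48)."""
    denom = 1.0 - a ** (2 * (j - 1)) + a ** (2 * j) - a ** (2 * (j + 1))
    return (1.0 - a * a) / denom
```

When every $|\alpha_i|$ and $|\beta|$ equals $a$ the two functions agree, which is the equality case of (3.2.49); when $\alpha_{j-1} = 0$ the first function returns one, in line with the remark above.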

3.3 Tests of Hypothesis

Since the distributions of the estimators are known,
tests of hypothesis may be carried out with ease. We
tabulate here a few hypotheses of interest.

It is desired to test the hypothesis

\[ H_0 : \alpha_k = \alpha_{k+1} = \ldots = \alpha_m = \eta_0 \quad (k = 2, 3, \ldots, m) \]

against (3.3.1)

\[ H_a : \text{at least one of the equalities does not hold.} \]

With

\[ t_j = (\nu-j+1)^{\frac12}(g_{j,j-1} - \eta_0) : 2 \le j \le m , \tag{3.3.2} \]

define

\[ p_j = P\{|t_{(\nu-j+1)}| \ge |t_j|\} : 2 \le j \le m . \tag{3.3.3} \]

An appropriate test statistic for testing (3.3.1) is

\[ L = -2\sum_{j=k}^{m} \log_e(p_j) . \tag{3.3.4} \]

The quantity $L$ has the chi-square distribution on $2(m-k+1)$
degrees of freedom. The hypothesis is rejected at significance
level $\alpha$ if

\[ L > \ell , \tag{3.3.5} \]

where $\ell$ is chosen so that

\[ P\{\chi^2_{2(m-k+1)} \ge \ell\} = \alpha . \tag{3.3.6} \]

This procedure is called Fisher's method of combining
independent tests. It has been shown by Littell and Folks
[12] to be asymptotically optimal over other tests as judged
by Bahadur relative efficiency. The Bahadur relative
efficiency compares the rates at which the competing test
statistics' observed significance levels converge to zero,
in some sense, when the null hypothesis is false. The
interested reader is referred to Bahadur [3] and Littell
and Folks [12].
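
The combination step in (3.3.4) through (3.3.6) can be sketched as follows. This is an illustration under stated assumptions: the p-values $p_j$ of (3.3.3) are taken as already computed, the function names are hypothetical, and the chi-square tail probability uses the standard closed form for even degrees of freedom, $P\{\chi^2_{2r} \ge x\} = e^{-x/2}\sum_{i=0}^{r-1}(x/2)^i/i!$, which applies here because the degrees of freedom are $2(m-k+1)$.

```python
import math


def chi2_upper_tail_even(x, df):
    """P{chi-square(df) >= x} for even df, via the closed-form finite sum."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0      # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_test(p_values):
    """Return (L, combined p-value) for independent p-values, as in (3.3.4)."""
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return stat, chi2_upper_tail_even(stat, 2 * len(p_values))
```

The hypothesis in (3.3.1) would then be rejected at level $\alpha$ when the returned combined p-value is below $\alpha$, which is equivalent to $L > \ell$ in (3.3.5).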

The above hypothesis has some interesting interpretations
for choices of $\eta_0$. If $\eta_0 = 0$, we are testing whether the
process is white noise from some point $k$ on. In the case
where $\eta_0$ is a constant, not equal to zero, we are hypothesizing
that the time series is stationary.

Hypotheses concerning individual parameters can be
carried out in the usual manner since the distribution of
the estimator is known.

An hypothesis of importance concerning a single
parameter would be

\[ H_0 : \beta^2 = \beta_0^2 \]

against (3.3.7)

\[ H_a : \beta^2 \ne \beta_0^2 . \]

An appropriate test statistic is

\[ F_0 = mc_1\beta_*^2 / (mc_1 - 2)\beta_0^2 , \tag{3.3.8} \]

which has an F distribution on $\nu$ and $mc_1$ degrees of
freedom, where $c_1$ is defined in equation (3.2.12) as
$\nu - \tfrac12(m+1)$. The null hypothesis is rejected at the $\alpha$ level
of significance if

\[ F_0 > F^{\nu}_{mc_1,\alpha} , \tag{3.3.9} \]

where $F^{\nu}_{mc_1,\alpha}$ is chosen so that

\[ P\{F^{\nu}_{mc_1} > F^{\nu}_{mc_1,\alpha}\} = \alpha . \tag{3.3.10} \]

Choosing $\beta_0^2 = 1$, the hypothesis implies homoscedasticity
between the initial observation and the errors of the "noise"
process.


Chapter  IV 
THE  MAXIMUM  LIKELIHOOD  ESTIMATORS 

4.1 Introduction

In  order  to  comment  further  on  the  value  of  the 
estimators  given  in  Chapter  III  some  standard  of  comparison 
must  be  employed.   To  this  end  we  study  the  maximum 
likelihood  estimators.   In  this  chapter  we  obtain  the 
maximum  likelihood  estimators  and  examine  their  sampling 
properties.   A  comparison  is  then  made  between  the  maximum 
likelihood  estimators  and  the  starred  estimators  of  Chapter  III 

4.2 The Maximum Likelihood Estimators and Their Distribution

As in section 2 of Chapter III, we suppose that we
observe a number of (m+1)-variate column vectors $y_j : 1 \le j \le n$
$(n > m+1)$, obtained by random sampling from an (m+1)-variate
normal population with mean $\mu$ and dispersion matrix $V$. As
in Chapter III, we estimate $\mu$ by $\bar{y}$ and form the $(m+1 \times m+1)$
matrix $W$ by

\[ W = \sum_{j=1}^{n} (y_j - \bar{y})(y_j - \bar{y})' . \tag{4.2.1} \]

$W$ has the central Wishart distribution with $\nu = n-1$ degrees
of freedom and dispersion matrix $V$. $W$ may also be represented as

\[ W = \sum_{j=1}^{\nu} z_j z_j' , \tag{4.2.2} \]

where $z_1, z_2, \ldots, z_\nu$ $(\nu = n-1)$ are mutually independent and

\[ z_j \sim N_{m+1}(0, V) : 1 \le j \le \nu . \tag{4.2.3} \]

To see this let the $(m+1 \times n)$ matrix $Y$ be defined by

\[ Y = (y_1, y_2, \ldots, y_n) , \tag{4.2.4} \]

and let $B$ be any orthogonal $(n \times n)$ matrix with last column

\[ \Big(\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}, \ldots, \frac{1}{\sqrt{n}}\Big)' . \tag{4.2.5} \]

Define the $(m+1 \times n)$ matrix $Z = (z_1, z_2, \ldots, z_n)$ by

\[ Z = YB . \tag{4.2.6} \]

We note that

\[ z_n = \sqrt{n}\,\bar{y} . \tag{4.2.7} \]

Now $W$ may be written as

\[ W = YY' - n\bar{y}\bar{y}' , \tag{4.2.8} \]

and since $B$ is orthogonal ($BB' = I$) we may write

\[ YY' = (YB)(YB)' = ZZ' , \tag{4.2.9} \]

and upon substituting $YY' = ZZ'$ and $z_n = \sqrt{n}\,\bar{y}$ into
(4.2.8) we find

\[ W = ZZ' - z_n z_n'
     = \sum_{j=1}^{n} z_j z_j' - z_n z_n'
     = \sum_{j=1}^{n-1} z_j z_j' . \tag{4.2.10} \]

Hence we have the representation

\[ w_{ij} = \sum_{k=1}^{n} (y_{ik} - \bar{y}_{i\cdot})(y_{jk} - \bar{y}_{j\cdot})
          = \sum_{k=1}^{\nu} z_{ik}z_{jk} : 0 \le i \le m;\ 0 \le j \le m . \tag{4.2.11} \]

Henceforth we shall use $W = ZZ'$, keeping in mind the
representation given in (4.2.11).
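
The identity (4.2.10) can be checked numerically for a small data matrix. The sketch below is illustrative only: it assumes one particular orthogonal matrix $B$ (a Helmert-type construction, chosen here because its last column is constant, as (4.2.5) requires), and the function names are hypothetical.

```python
import math


def helmert_like(n):
    """An n x n orthogonal matrix whose last column is (1/sqrt(n), ..., 1/sqrt(n))'.

    Assumption: any orthogonal B with this last column would do; this is one choice.
    """
    b = [[0.0] * n for _ in range(n)]
    for k in range(1, n):                     # fills columns 0 .. n-2
        scale = math.sqrt(k * (k + 1))
        for i in range(k):
            b[i][k - 1] = 1.0 / scale
        b[k][k - 1] = -k / scale
    for i in range(n):
        b[i][n - 1] = 1.0 / math.sqrt(n)
    return b


def scatter_two_ways(y):
    """y has one row per time point and one column per replicate, as Y in (4.2.4).

    Returns W from (4.2.1), W from the first n-1 columns of Z = YB, Z, and ybar.
    """
    p, n = len(y), len(y[0])
    ybar = [sum(row) / n for row in y]
    w_direct = [[sum((y[i][k] - ybar[i]) * (y[j][k] - ybar[j]) for k in range(n))
                 for j in range(p)] for i in range(p)]
    b = helmert_like(n)
    z = [[sum(y[i][r] * b[r][c] for r in range(n)) for c in range(n)]
         for i in range(p)]
    w_from_z = [[sum(z[i][k] * z[j][k] for k in range(n - 1)) for j in range(p)]
                for i in range(p)]
    return w_direct, w_from_z, z, ybar
```

Up to rounding error the two versions of $W$ coincide, and the last column of $Z$ equals $\sqrt{n}\,\bar{y}$ as in (4.2.7).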

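Assuming that $Y(t) = \mu(t) + X(t)$, where $X(t)$ follows the
first order generalized autoregressive process, then $V \in \mathcal{V}$ and
has the properties given in section 2 of Chapter II. Since
it is through $W$ that we obtain estimates, the elements of $W$
serve as observations and the likelihood of this set of
observations is $L = f(W)$ given in (3.2.25). Taking the
logarithm of (3.2.25) and utilizing the form and properties
of $V$ we obtain

\[ \log L = C - \tfrac12\nu(m+1)\log\sigma^2 - \tfrac{\nu}{2}\log\beta^2
   + \tfrac12(\nu-m-2)\log|W| - \tfrac12\,\mathrm{tr}\,V^{-1}W , \tag{4.2.12} \]

where $C$ is a constant. Recalling that $A = V^{-1}$ has the
special form given by (2.2.11) we may write

\[ \mathrm{tr}\,V^{-1}W = \mathrm{tr}\,AW
   = \frac{1}{\sigma^2}\Big\{(\beta^2)^{-1}w_{00} + \sum_{j=1}^{m} w_{jj}
   + \sum_{j=1}^{m}(\alpha_j^2 w_{j-1,j-1} - 2\alpha_j w_{j-1,j})\Big\} . \tag{4.2.13} \]

Substituting (4.2.13) into (4.2.12) we find

\[ \log L = C - \tfrac12\nu(m+1)\log\sigma^2 - \tfrac{\nu}{2}\log\beta^2 + \tfrac12(\nu-m-2)\log|W|
   - \frac{1}{2\sigma^2}\Big\{(\beta^2)^{-1}w_{00} + \sum_{j=1}^{m} w_{jj}
   + \sum_{j=1}^{m}(\alpha_j^2 w_{j-1,j-1} - 2\alpha_j w_{j-1,j})\Big\} . \tag{4.2.14} \]

Differentiation of (4.2.14) with respect to $\alpha_j$ yields

\[ \frac{\partial \log L}{\partial\alpha_j}
   = -\frac{1}{\sigma^2}\{\alpha_j w_{j-1,j-1} - w_{j-1,j}\} : 1 \le j \le m , \tag{4.2.15} \]

differentiation with respect to $\sigma^2$ yields

\[ \frac{\partial \log L}{\partial\sigma^2}
   = -\frac{\nu(m+1)}{2\sigma^2} + \frac{1}{2\sigma^4}\Big\{(\beta^2)^{-1}w_{00} + \sum_{j=1}^{m} w_{jj}
   + \sum_{j=1}^{m}(\alpha_j^2 w_{j-1,j-1} - 2\alpha_j w_{j-1,j})\Big\} , \tag{4.2.16} \]

and finally differentiation with respect to $\beta^2$ gives

\[ \frac{\partial \log L}{\partial\beta^2}
   = -\frac{\nu}{2\beta^2} + \frac{w_{00}}{2\sigma^2\beta^4} . \tag{4.2.17} \]

Setting $\{\partial \log L/\partial\alpha_j : 1 \le j \le m\}$, $\partial \log L/\partial\sigma^2$, and
$\partial \log L/\partial\beta^2$ equal to zero and solving, we obtain the maximum
likelihood estimators:

\[ \hat{\alpha}_j = w_{j-1,j}\,w_{j-1,j-1}^{-1} : 1 \le j \le m , \tag{4.2.18} \]

\[ \hat{\sigma}^2 = [\nu(m+1)]^{-1}\Big\{\hat{\beta}^{-2}w_{00}
   + \sum_{j=1}^{m}(w_{jj} - w_{j-1,j}^2\,w_{j-1,j-1}^{-1})\Big\} , \tag{4.2.19} \]

and

\[ \hat{\beta}^2 = (\nu\hat{\sigma}^2)^{-1}w_{00} . \tag{4.2.20} \]

Eliminating $\hat{\beta}^2$ from equation (4.2.19) yields

\[ \hat{\sigma}^2 = (\nu m)^{-1}\sum_{j=1}^{m}(w_{jj} - w_{j-1,j}^2\,w_{j-1,j-1}^{-1}) . \tag{4.2.21} \]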
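
The closed forms (4.2.18), (4.2.20), and (4.2.21) translate directly into a short routine. This is a minimal sketch assuming $W$ is supplied as a nested list indexed from zero; the function name `ml_estimators` is hypothetical.

```python
def ml_estimators(w, nu):
    """Return (alpha_hat, sigma2_hat, beta2_hat) from W and nu = n - 1."""
    m = len(w) - 1
    # (4.2.18): alpha_hat_j = w_{j-1,j} / w_{j-1,j-1}
    alpha_hat = [w[j - 1][j] / w[j - 1][j - 1] for j in range(1, m + 1)]
    # residual sums of squares w_jj - w_{j-1,j}^2 / w_{j-1,j-1}
    resid = [w[j][j] - w[j - 1][j] ** 2 / w[j - 1][j - 1]
             for j in range(1, m + 1)]
    sigma2_hat = sum(resid) / (nu * m)        # (4.2.21)
    beta2_hat = w[0][0] / (nu * sigma2_hat)   # (4.2.20)
    return alpha_hat, sigma2_hat, beta2_hat
```

For the $3 \times 3$ matrix with rows $(4, 2, 1)$, $(2, 3, -0.5)$, $(1, -0.5, 3.75)$ and $\nu = 10$, this gives $\hat{\alpha}_1 = 1/2$, $\hat{\alpha}_2 = -1/6$, $\hat{\sigma}^2 = 17/60$, and $\hat{\beta}^2 = 24/17$; substituting these back into (4.2.19) returns the same $\hat{\sigma}^2$.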

We now proceed to determine the distribution of the
maximum likelihood estimators. In order to do this we shall
use conditional arguments frequently. We shall write

\[ y \,|\, z \sim f(\cdot) \tag{4.2.22} \]

in order to imply that "y conditional on z has the density
$f(\cdot)$." Using the representation of $w_{ij}$ given in (4.2.11)
one has

\[ \hat{\alpha}_j = \frac{\sum_{k=1}^{\nu} z_{j-1,k}\,z_{jk}}{\sum_{k=1}^{\nu} z_{j-1,k}^2} : 1 \le j \le m . \tag{4.2.23} \]

Letting

\[ \phi_{jk} = \frac{z_{jk}}{\sum_{k=1}^{\nu} z_{jk}^2} : 0 \le j \le m;\ 1 \le k \le \nu , \tag{4.2.24} \]

we may write

\[ \hat{\alpha}_j = \sum_{k=1}^{\nu} \phi_{j-1,k}\,z_{jk} : 1 \le j \le m . \tag{4.2.25} \]

Recalling that

\[ z_k \sim N_{m+1}(0, V) : 1 \le k \le \nu \tag{4.2.26} \]

when $V \in \mathcal{V}$, we must be able to represent $z_{jk}$ by

\[ z_{jk} = \alpha_j z_{j-1,k} + \varepsilon_{jk} : 1 \le j \le m;\ 1 \le k \le \nu , \tag{4.2.27} \]

where the $\varepsilon_{jk}$ are independent identically distributed normal
random variables with mean zero and variance $\sigma^2$.
Hence

\[ z_{jk} \,|\, z_{j-1,k} \sim N(\alpha_j z_{j-1,k}, \sigma^2) : 1 \le j \le m;\ 1 \le k \le \nu . \tag{4.2.28} \]

Using (4.2.28) in (4.2.25),

\[ \hat{\alpha}_j \,|\, \{z_{j-1,k} : 1 \le k \le \nu\} \sim
   N\Big(\alpha_j\sum_{k=1}^{\nu}\phi_{j-1,k}z_{j-1,k};\ \sigma^2\sum_{k=1}^{\nu}\phi_{j-1,k}^2\Big) . \tag{4.2.29} \]

Since

\[ \sum_{k=1}^{\nu}\phi_{j-1,k}z_{j-1,k}
   = \frac{\sum_{k=1}^{\nu} z_{j-1,k}^2}{\sum_{k=1}^{\nu} z_{j-1,k}^2} = 1 \tag{4.2.30} \]

and

\[ \sum_{k=1}^{\nu}\phi_{j-1,k}^2
   = \frac{\sum_{k=1}^{\nu} z_{j-1,k}^2}{\big(\sum_{k=1}^{\nu} z_{j-1,k}^2\big)^2}
   = \Big(\sum_{k=1}^{\nu} z_{j-1,k}^2\Big)^{-1} , \tag{4.2.31} \]

we find upon their substitution into (4.2.29) that

\[ \hat{\alpha}_j \,|\, \{z_{j-1,k} : 1 \le k \le \nu\} \sim
   N\Big(\alpha_j,\ \sigma^2\Big(\sum_{k=1}^{\nu} z_{j-1,k}^2\Big)^{-1}\Big) . \tag{4.2.32} \]
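To complete the derivation we note the following.
Let

\[ u \sim N(\mu, \sigma^2) \tag{4.2.33} \]

and

\[ v \sim \sigma^2\chi^2_{\nu}(0) , \tag{4.2.34} \]

then Student's t-distribution is defined as

\[ t = (u - \mu)\Big(\frac{v}{\nu}\Big)^{-\frac12} . \tag{4.2.35} \]

We note that the distribution of $t$ conditional on $v$ is

\[ t \,|\, v \sim N(0, \nu\sigma^2 v^{-1}) . \tag{4.2.36} \]

Hence, by analogy, the distribution of $\hat{\alpha}_j$ is also t and
we have

\[ \nu^{\frac12}\Big(\frac{v_{j-1,j-1}}{\sigma^2}\Big)^{\frac12}(\hat{\alpha}_j - \alpha_j) \sim t_{\nu}(0) : 1 \le j \le m , \tag{4.2.37} \]

where

\[ \frac{v_{00}}{\sigma^2} = \beta^2 , \]

and

\[ \frac{v_{j-1,j-1}}{\sigma^2} = (1 + \alpha_{j-1}^2 + \alpha_{j-1}^2\alpha_{j-2}^2 + \ldots
   + \alpha_{j-1}^2\alpha_{j-2}^2\cdots\alpha_2^2
   + \alpha_{j-1}^2\alpha_{j-2}^2\cdots\alpha_2^2\alpha_1^2\beta^2) : 2 \le j \le m . \tag{4.2.38} \]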

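To find the distribution of $\hat{\sigma}^2$ we define the $(\nu \times 1)$
column vectors $\theta_j = (\theta_{j1}, \theta_{j2}, \ldots, \theta_{j\nu})'$ by

\[ \theta_{jk} = \Big(\sum_{k=1}^{\nu} z_{jk}^2\Big)^{-\frac12} z_{jk} : 1 \le k \le \nu;\ 0 \le j \le m , \tag{4.2.39} \]

and note that

\[ \theta_j'\theta_j = 1 : 0 \le j \le m . \tag{4.2.40} \]

In terms of the $(m+1 \times \nu)$ matrix $Z$ we have

\[ \theta_j = \{(Z)_{j\cdot}(Z)_{j\cdot}'\}^{-\frac12}(Z)_{j\cdot}' . \tag{4.2.41} \]

For convenience we take

\[ \eta_j = w_{jj} - w_{j-1,j}^2\,w_{j-1,j-1}^{-1} : 1 \le j \le m , \tag{4.2.42} \]

and write

\[ \hat{\sigma}^2 = \frac{1}{m\nu}\sum_{j=1}^{m}\eta_j . \tag{4.2.43} \]

Using $\theta_j$ we may write

\[ \eta_j = (Z)_{j\cdot}(I_\nu - \theta_{j-1}\theta_{j-1}')(Z)_{j\cdot}' : 1 \le j \le m , \tag{4.2.44} \]

where $I_\nu$ is the $(\nu \times \nu)$ identity matrix. Consider the
distribution of $\eta_m$ conditional on $\{(Z)_{j\cdot} : 0 \le j \le m-1\}$. Now

\[ (Z)_{m\cdot}' \,|\, \{(Z)_{j\cdot} : 0 \le j \le m-1\} \sim N_{\nu\times 1}(\alpha_m(Z)_{m-1,\cdot}';\ \sigma^2 I_\nu) . \tag{4.2.45} \]

Conditional on $\{(Z)_{j\cdot} : 0 \le j \le m-1\}$ the matrix $(I_\nu - \theta_{m-1}\theta_{m-1}')$
is symmetric and idempotent with rank $(\nu-1)$, and the
quadratic form $\eta_m$ follows the non-central chi-square
distribution, that is

\[ \eta_m \,|\, \{(Z)_{j\cdot} : 0 \le j \le m-1\} \sim \sigma^2\chi^2_{\nu-1}(\gamma_m) , \tag{4.2.46} \]

where

\[ \gamma_m = \frac{\alpha_m^2}{\sigma^2}\,(Z)_{m-1,\cdot}(I_\nu - \theta_{m-1}\theta_{m-1}')(Z)_{m-1,\cdot}' . \tag{4.2.47} \]

Upon replacing $\theta_{m-1}$ by its definition in (4.2.47) we find

\[ \gamma_m = 0 . \tag{4.2.48} \]

Since none of $\{(Z)_{j\cdot} : 0 \le j \le m-1\}$ enter into the distribution
of $\eta_m$, the unconditional distribution is the same and
hence $\eta_m$ is independent of $\{(Z)_{j\cdot} : 0 \le j \le m-1\}$, that is

\[ \eta_m \sim \sigma^2\chi^2_{(\nu-1)}(0) . \tag{4.2.49} \]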

The distribution of $\eta_{m-1}$, conditional on $\{(Z)_{j\cdot} : 0 \le j \le m-2\}$,
can be obtained in exactly the same manner, and since $\eta_{m-1}$
depends only on $\{(Z)_{j\cdot} : 0 \le j \le m-1\}$ it is independent of
$\eta_m$. By the same arguments as above $\eta_{m-1}$ can be shown to
be independent of $\{(Z)_{j\cdot} : 0 \le j \le m-2\}$ and hence

\[ \eta_{m-1} \sim \sigma^2\chi^2_{(\nu-1)}(0) . \tag{4.2.50} \]

In exactly the same manner we can show that $\eta_k$ is independent
of $\eta_{k+1}$, and hence also of $\eta_{k+2}, \eta_{k+3}, \ldots, \eta_m$. Again the
distribution of $\eta_k$ is free
of $\{(Z)_{j\cdot} : 0 \le j \le k-1\}$, so that the unconditional distribution of
$\eta_k$ is identical to that of $\eta_m$. In this way we argue that $\{\eta_j : 1 \le j \le m\}$
are mutually independent and identically distributed with

\[ \eta_j \sim \sigma^2\chi^2_{(\nu-1)}(0) : 1 \le j \le m . \tag{4.2.51} \]

Since $\hat{\sigma}^2 = \frac{1}{m\nu}\sum_{j=1}^{m}\eta_j$ we have

\[ \hat{\sigma}^2 \sim \frac{\sigma^2}{m\nu}\,\chi^2_{m(\nu-1)}(0) . \tag{4.2.52} \]

In the above argument we note that the variables
$\{\eta_m, \eta_{m-1}, \ldots, \eta_k\}$ are independent of $\{(Z)_{j\cdot} : 0 \le j \le k-1\}$.
Hence we have that $\{\eta_m, \eta_{m-1}, \ldots, \eta_1\}$ are independent of $(Z)_{0\cdot}$,
and hence of

\[ w_{00} = (Z)_{0\cdot}(Z)_{0\cdot}' . \tag{4.2.53} \]

Since

\[ z_{0k} \sim N(0, \sigma^2\beta^2) : 1 \le k \le \nu , \tag{4.2.54} \]

then

\[ w_{00} \sim \sigma^2\beta^2\chi^2_{\nu}(0) . \tag{4.2.55} \]

Since $\hat{\beta}^2 = (\nu\hat{\sigma}^2)^{-1}w_{00}$, we have

\[ \frac{(\nu-1)\,\hat{\beta}^2}{\nu\,\beta^2} \sim F^{\nu}_{m(\nu-1)}(0) . \tag{4.2.56} \]

4.3 Properties of the Maximum Likelihood Estimators

Since the distribution of the maximum likelihood
estimators is known, their properties are easily obtained.
We find using equations (4.2.37), (4.2.52), and (4.2.56)
that

\[ \mathcal{E}(\hat{\alpha}_j) = \alpha_j : 1 \le j \le m , \tag{4.3.1} \]

\[ \mathcal{E}(\hat{\sigma}^2) = (\nu-1)\,\nu^{-1}\sigma^2 , \tag{4.3.2} \]

and

\[ \mathcal{E}(\hat{\beta}^2) = m\nu\,[m(\nu-1) - 2]^{-1}\beta^2 . \tag{4.3.3} \]

Hence the $\hat{\alpha}_j$ are unbiassed estimators while $\hat{\sigma}^2$ and $\hat{\beta}^2$ are
biassed. Since unbiassedness is a desirable property we shall
use the unbiassed estimators of $\sigma^2$ and $\beta^2$ in calculating the
rest of the properties.
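
The bias corrections implied by (4.3.2) and (4.3.3) amount to two multiplicative factors. A minimal sketch follows; the function name `unbiased_from_ml` is hypothetical, and the routine assumes $m(\nu-1) > 2$ so that the expectation in (4.3.3) exists.

```python
def unbiased_from_ml(sigma2_hat, beta2_hat, nu, m):
    """Rescale the maximum likelihood estimators so that they are unbiassed.

    sigma2: multiply by nu / (nu - 1), from (4.3.2).
    beta2:  multiply by [m(nu - 1) - 2] / (m nu), from (4.3.3).
    """
    sigma2_unbiased = nu / (nu - 1.0) * sigma2_hat
    beta2_unbiased = (m * (nu - 1.0) - 2.0) / (m * nu) * beta2_hat
    return sigma2_unbiased, beta2_unbiased
```

Applying these factors to the expectations in (4.3.2) and (4.3.3) returns $\sigma^2$ and $\beta^2$ exactly.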

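Letting $\hat{\Sigma}$ denote the $(m+2 \times m+2)$ variance-covariance
matrix of the (m+2) dimensional vector of unbiassed
estimators $(\hat{\alpha}_1, \hat{\alpha}_2, \ldots, \hat{\alpha}_m, \nu(\nu-1)^{-1}\hat{\sigma}^2, [m(\nu-1)-2](m\nu)^{-1}\hat{\beta}^2)$,
we find

\[ (\hat{\Sigma})_{11} = (\nu-2)^{-1}\beta^{-2} , \tag{4.3.4} \]

\[ (\hat{\Sigma})_{jj} = (\nu-2)^{-1}\,v_{j-1,j-1}^{-1}\,\sigma^2 : 2 \le j \le m , \tag{4.3.5} \]

where $v_{j-1,j-1}$ is defined in (4.2.38),

\[ (\hat{\Sigma})_{m+1,m+1} = 2[m(\nu-1)]^{-1}\sigma^4 , \tag{4.3.6} \]

\[ (\hat{\Sigma})_{m+2,m+2} = 2[m(\nu-1) + \nu - 2]\,[m\nu(\nu-1) - 4\nu]^{-1}\beta^4 , \tag{4.3.7} \]

\[ (\hat{\Sigma})_{m+1,m+2} = (\hat{\Sigma})_{m+2,m+1} = -2[m(\nu-1)]^{-1}\sigma^2\beta^2 , \tag{4.3.8} \]

and

\[ (\hat{\Sigma})_{j\ell} = 0 : \text{elsewhere} . \tag{4.3.9} \]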
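To see that the set of estimators $\{\hat{\alpha}_j : 1 \le j \le m\}$ is
independent of the set $\{\hat{\sigma}^2, \hat{\beta}^2\}$ we note that the distributions
of $\hat{\sigma}^2$ and $\hat{\beta}^2$ are free of the elements of $\hat{\alpha}_j$, and hence the two
sets of estimators are independent. To see that the covariances
between the $\hat{\alpha}$'s are zero we note that

\[ \hat{\alpha}_j = \sum_{k=1}^{\nu}\phi_{j-1,k}\,z_{jk}
   = \sum_{k=1}^{\nu}\phi_{j-1,k}(\alpha_j z_{j-1,k} + \varepsilon_{jk})
   = \alpha_j + \sum_{k=1}^{\nu}\phi_{j-1,k}\,\varepsilon_{jk} : 1 \le j \le m , \tag{4.3.10} \]

where $\phi_{jk}$ is defined by equation (4.2.24) as

\[ \phi_{jk} = \frac{z_{jk}}{\sum_{k=1}^{\nu} z_{jk}^2} . \]

Hence the covariance between $\hat{\alpha}_j$ and $\hat{\alpha}_\ell$ $(j \ne \ell)$ is given by

\[ (\hat{\Sigma})_{j\ell} = \mathcal{E}(\hat{\alpha}_j\hat{\alpha}_\ell) - \mathcal{E}(\hat{\alpha}_j)\mathcal{E}(\hat{\alpha}_\ell) \]

\[ \quad = \mathcal{E}\Big[\Big(\alpha_j + \sum_{k=1}^{\nu}\phi_{j-1,k}\varepsilon_{jk}\Big)
   \Big(\alpha_\ell + \sum_{k=1}^{\nu}\phi_{\ell-1,k}\varepsilon_{\ell k}\Big)\Big] - \alpha_j\alpha_\ell \]

\[ \quad = \mathcal{E}\Big(\alpha_j\alpha_\ell + \alpha_j\sum_{k=1}^{\nu}\phi_{\ell-1,k}\varepsilon_{\ell k}
   + \alpha_\ell\sum_{k=1}^{\nu}\phi_{j-1,k}\varepsilon_{jk}
   + \sum_{i=1}^{\nu}\sum_{k=1}^{\nu}\phi_{j-1,i}\phi_{\ell-1,k}\varepsilon_{ji}\varepsilon_{\ell k}\Big) - \alpha_j\alpha_\ell . \tag{4.3.11} \]

Since the $\{\varepsilon_{st} : 1 \le s \le m,\ 1 \le t \le \nu\}$ are independent identically
distributed normal random variables with zero mean, we have,
taking expectations first with respect to the $\varepsilon_{st}$'s, that the
last three terms in brackets in (4.3.11) vanish and we are left
with

\[ (\hat{\Sigma})_{j\ell} = \alpha_j\alpha_\ell - \alpha_j\alpha_\ell = 0 : 1 \le j \ne \ell \le m . \tag{4.3.12} \]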

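Although this shows the estimators are uncorrelated, it
is not true that they are independent. To see this we
examine the case for $m = 2$. Write

\[ W = GDG' = \begin{pmatrix}
 d_0       & d_0 g_{10}                  & d_0 g_{20} \\
 d_0 g_{10} & d_1 + d_0 g_{10}^2          & d_1 g_{21} + d_0 g_{10}g_{20} \\
 d_0 g_{20} & d_1 g_{21} + d_0 g_{10}g_{20} & d_2 + d_1 g_{21}^2 + d_0 g_{20}^2
\end{pmatrix} . \tag{4.3.13} \]

Hence,

\[ \hat{\alpha}_1 = \frac{w_{01}}{w_{00}} = g_{10} , \tag{4.3.14} \]

and

\[ \hat{\alpha}_2 = \frac{w_{12}}{w_{11}} = \frac{(d_0 g_{10}g_{20} + d_1 g_{21})}{d_0 g_{10}^2 + d_1} . \tag{4.3.15} \]

We shall show that

\[ \hat{\alpha}_2 \,|\, (g_{10}, d_0, d_1) \sim N\Big(\alpha_2, \frac{\sigma^2}{d_1 + d_0 g_{10}^2}\Big) ; \tag{4.3.16} \]

by equation (4.3.14) this is equivalent to

\[ \hat{\alpha}_2 \,|\, (\hat{\alpha}_1, d_0, d_1) \sim N\Big(\alpha_2, \frac{\sigma^2}{d_1 + d_0\hat{\alpha}_1^2}\Big) . \tag{4.3.17} \]

Since the conditional distribution of $\hat{\alpha}_2$ depends on $\hat{\alpha}_1$ we
have that they are dependent. To show (4.3.16) we need the
distribution of $(g_{10}, g_{20}, g_{21})$ conditional on $(d_0, d_1, d_2)$.
Referring back to (2.5.21) we have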

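\[ g_0 = (g_{10}, g_{20})' \,|\, d_0 \sim N_2(\mu_0, V_0) , \tag{4.3.18} \]

where

\[ \mu_0 = (\alpha_1, \alpha_1\alpha_2)' , \tag{4.3.19} \]

\[ V_0 = \begin{pmatrix}
 \dfrac{\sigma^2}{d_0}          & \alpha_2\dfrac{\sigma^2}{d_0} \\
 \alpha_2\dfrac{\sigma^2}{d_0}  & (1 + \alpha_2^2)\dfrac{\sigma^2}{d_0}
\end{pmatrix} , \tag{4.3.20} \]

and

\[ g_{21} \,|\, (d_0, d_1) \sim N\Big(\alpha_2, \frac{\sigma^2}{d_1}\Big) , \tag{4.3.21} \]

independent of $(g_{10}, g_{20})$. Now the distribution of $g_{20}$
conditional on $(g_{10}, d_0, d_1)$ is easily shown to be

\[ g_{20} \,|\, (g_{10}, d_0, d_1) \sim N\Big(\alpha_2 g_{10}, \frac{\sigma^2}{d_0}\Big) . \tag{4.3.22} \]

Since $g_{21}$ is independent of $g_{10}$, conditioning on $g_{10}$ does
not affect the distribution of $g_{21}$, that is,

\[ g_{21} \,|\, (g_{10}, d_0, d_1) \sim N\Big(\alpha_2, \frac{\sigma^2}{d_1}\Big) . \]

We note, by equation (4.3.15), that $\hat{\alpha}_2$ conditional on $(g_{10},
d_0, d_1)$ is simply a linear combination of $g_{20}$ and $g_{21}$. Since
they are normally distributed (conditionally) so will $\hat{\alpha}_2$ be
(conditionally). All we need do is calculate the mean and
variance to find the conditional distribution of $\hat{\alpha}_2$.

\[ \mathcal{E}\{\hat{\alpha}_2 \,|\, (g_{10}, d_0, d_1)\}
   = \mathcal{E}\Big\{\frac{(d_0 g_{10}g_{20} + d_1 g_{21})}{d_0 g_{10}^2 + d_1} \,\Big|\, (g_{10}, d_0, d_1)\Big\}
   = \frac{\alpha_2(d_0 g_{10}^2 + d_1)}{d_0 g_{10}^2 + d_1}
   = \alpha_2 . \tag{4.3.23} \]

\[ \mathrm{Var}\{\hat{\alpha}_2 \,|\, (g_{10}, d_0, d_1)\}
   = \mathrm{Var}\Big\{\frac{(d_0 g_{10}g_{20} + d_1 g_{21})}{d_0 g_{10}^2 + d_1} \,\Big|\, (g_{10}, d_0, d_1)\Big\} ; \]

recalling that $g_{20}$ and $g_{21}$ are independent we have

\[ \qquad = \frac{d_0^2 g_{10}^2\,\mathrm{Var}\{g_{20} \,|\, (g_{10}, d_0, d_1)\} + d_1^2\,\mathrm{Var}\{g_{21} \,|\, (g_{10}, d_0, d_1)\}}{(d_0 g_{10}^2 + d_1)^2}
   = \frac{d_0^2 g_{10}^2\,\dfrac{\sigma^2}{d_0} + d_1^2\,\dfrac{\sigma^2}{d_1}}{(d_0 g_{10}^2 + d_1)^2}
   = \frac{\sigma^2}{(d_0 g_{10}^2 + d_1)} . \tag{4.3.24} \]

Hence

\[ \hat{\alpha}_2 \,|\, (g_{10}, d_0, d_1) \sim N\Big(\alpha_2, \frac{\sigma^2}{d_0 g_{10}^2 + d_1}\Big) \]

as was to be shown.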

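Furthermore it can be shown that $\hat{\alpha}_1$ and $\hat{\alpha}_2$ do not have
a bivariate t-distribution. To see this we find
$f(\hat{\alpha}_2 \,|\, \hat{\alpha}_1 = 0)$. We have that

\[ \hat{\alpha}_2 \,|\, (\hat{\alpha}_1, d_0, d_1) \sim N\Big(\alpha_2, \frac{\sigma^2}{d_0\hat{\alpha}_1^2 + d_1}\Big) \]

and hence

\[ \hat{\alpha}_2 \,|\, (\hat{\alpha}_1 = 0, d_0, d_1) \sim N\Big(\alpha_2, \frac{\sigma^2}{d_1}\Big) . \tag{4.3.25} \]

Since the distribution does not depend on $d_0$ and since
$d_0$ and $d_1$ are independent we have

\[ \hat{\alpha}_2 \,|\, (\hat{\alpha}_1 = 0, d_1) \sim N\Big(\alpha_2, \frac{\sigma^2}{d_1}\Big) , \tag{4.3.26} \]

and from (2.5.5)

\[ d_1 \sim \sigma^2\chi^2_{\nu-1}(0) , \tag{4.3.27} \]

so that

\[ f(\hat{\alpha}_2, d_1 \,|\, \hat{\alpha}_1 = 0)
   = \frac{d_1^{\frac12}\exp\{-\frac{d_1}{2\sigma^2}(\hat{\alpha}_2 - \alpha_2)^2\}}{\sqrt{2\pi}\,\sigma}
     \cdot \frac{d_1^{\frac12(\nu-1)-1}\exp\{-\frac{d_1}{2\sigma^2}\}}{(2\sigma^2)^{\frac12(\nu-1)}\,\Gamma(\tfrac12(\nu-1))}
   : -\infty < \hat{\alpha}_2 < \infty;\ d_1 > 0;\ (\nu-1) > 0 . \tag{4.3.28} \]

Integrating over $d_1$, we have

\[ f(\hat{\alpha}_2 \,|\, \hat{\alpha}_1 = 0)
   = \int_0^\infty \frac{d_1^{\frac12\nu-1}\exp\{-\frac{d_1}{2\sigma^2}[1 + (\hat{\alpha}_2 - \alpha_2)^2]\}}
     {\sqrt{2\pi}\,\sigma\,(2\sigma^2)^{\frac12(\nu-1)}\,\Gamma(\tfrac12(\nu-1))}\,dd_1
   = \frac{\Gamma(\tfrac12\nu)\,[1 + (\hat{\alpha}_2 - \alpha_2)^2]^{-\frac12\nu}}{\Gamma(\tfrac12)\,\Gamma(\tfrac12(\nu-1))}
   : -\infty < \hat{\alpha}_2 < \infty;\ \nu > 0 . \tag{4.3.29} \]

Hence we find $\hat{\alpha}_2$ conditional on $\hat{\alpha}_1 = 0$ has a t-distribution
with $(\nu-1)$ degrees of freedom, but we know $\hat{\alpha}_2$ has the
t-distribution with $\nu$ degrees of freedom. This will lead to a
contradiction, so that $\hat{\alpha}_1$ and $\hat{\alpha}_2$ cannot have a bivariate
t-distribution.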

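Now suppose $t_1$ and $t_2$ have a bivariate t-distribution
with v degrees of freedom. Their joint density is given by

$$f(t_1,t_2) = \frac{\Gamma(\tfrac12(v+2))\,|\Sigma|^{-\frac12}}{\pi\,\Gamma(\tfrac12 v)}\,\left[1 + (t_1-\mu_1,\ t_2-\mu_2)\,\Sigma^{-1}\,(t_1-\mu_1,\ t_2-\mu_2)'\right]^{-\frac12(v+2)} : -\infty<t_1,t_2<\infty,\ v>0 ,$$   (4.3.30)

where $\mu_1$ and $\mu_2$ are the expected values of $t_1$ and $t_2$
respectively. Also $\Sigma$ is the (2x2) variance-covariance matrix
of $(t_1, t_2)$. Relating this to $\hat\alpha_1$ and $\hat\alpha_2$ we would replace
$(\mu_1,\mu_2)$ by $(\alpha_1,\alpha_2)$, and $\Sigma$ would be a diagonal matrix since
we have shown the covariance of $\hat\alpha_1$ and $\hat\alpha_2$ to be zero. Without
any loss of generality we may take $\mu_1 = \mu_2 = 0$ and $\Sigma = I$. The
marginal density of $t_1$ is

$$f(t_1) = \frac{\Gamma(\tfrac12(v+1))}{\Gamma(\tfrac12)\,\Gamma(\tfrac12 v)}\,[1+t_1^2]^{-\frac12(v+1)} : -\infty<t_1<\infty .$$   (4.3.31)

Hence the conditional density of $t_2$ given $t_1$ is

$$f(t_2 \mid t_1) = \frac{f(t_1,t_2)}{f(t_1)} = \frac{\Gamma(\tfrac12(v+2))\,[1+t_1^2]^{\frac12(v+1)}}{\Gamma(\tfrac12)\,\Gamma(\tfrac12(v+1))\,[1+t_1^2+t_2^2]^{\frac12(v+2)}} : -\infty<t_2<\infty ,\ v>0 .$$   (4.3.32)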
In particular, suppose $t_1 = 0$; then

$$f(t_2 \mid t_1 = 0) = \frac{\Gamma(\tfrac12(v+2))}{\Gamma(\tfrac12)\,\Gamma(\tfrac12(v+1))}\,[1+t_2^2]^{-\frac12(v+2)} : -\infty<t_2<\infty .$$   (4.3.33)

That is, $t_2$ conditional on $t_1 = 0$ has the Student t-distribution
with (v+1) degrees of freedom. Hence we see that if $(\hat\alpha_1,\hat\alpha_2)$
are bivariate t-distributed, then since $\hat\alpha_2$ conditional on
$\hat\alpha_1 = 0$ has the Student t-distribution with (v-1) degrees of
freedom it must be that $\hat\alpha_2$ has the Student t-distribution with
(v-2) degrees of freedom. This is a contradiction, since
$\hat\alpha_2$ has the Student t-distribution with v degrees of freedom.
Hence $(\hat\alpha_1,\hat\alpha_2)$ do not have a bivariate t-distribution.

Hence we see that the maximum likelihood estimators are
not independently distributed and their joint distribution
is not multivariate t. This of course is a drawback in using
the maximum likelihood estimators and accentuates the benefits
of using the starred estimators, which are independent and have
the t-distribution.

It is easily seen that the unbiased estimators are
consistent since the variance tends to zero as the sample
size increases without bound. To find the efficiency of the
maximum likelihood estimators we compare their variance to
the minimum variance bounds given in equations (3.2.37) through
(3.2.41). We find that

$$\mathrm{Eff}(\hat\alpha_j) = (v-2)\,v^{-1} : 1 \le j \le m ,$$   (4.3.34)

$$\mathrm{Eff}(v(v-1)^{-1}\hat\sigma^2) = (v-1)\,v^{-1} ,$$   (4.3.35)

and

$$\mathrm{Eff}([m(v-1)-2](mv)^{-1}\hat\beta^2) = [(m+1)(m(v-1)-4)]\,[m(m(v-1)+v-2)]^{-1} .$$   (4.3.36)
It is obvious that the estimators are asymptotically efficient.
Unlike the starred estimators, the efficiency of the maximum
likelihood estimators does not depend on the unknown parameters.
However, the distribution of the maximum likelihood estimators
$\{\hat\alpha_j : 1 \le j \le m\}$ does depend on the unknown parameters, so that a test of
hypothesis on $\alpha_j$ depends on knowing the values of $\alpha_1, \alpha_2, \ldots,
\alpha_{j-1}$. This clearly shows the trade-off between the starred
estimators and the maximum likelihood estimators. While the
starred estimators are inefficient, tests of hypotheses are
performed with no difficulty; conversely, the maximum
likelihood estimators are efficient (asymptotically), but
tests of hypotheses are complicated since their distribution
depends on several unknown parameters. Moreover the
dependence between the $\hat\alpha$'s also causes complications in making
tests of hypotheses concerning two or more of the parameters
since the joint distribution may be very complex.


Chapter  V 
A  TEST  OF  THE  ADEQUACY  OF  THE  MODEL 

5.1   Introduction

Throughout we have assumed that the process is adequately
described by the first order autoregressive model. In this
chapter we propose a method of testing the validity of this
assumption. Due to the assumption of the first order
autoregressive process a class of dispersion matrices arose which
we identified by $\mathcal{V}$. Since this class of dispersion matrices
is a consequence of the model, a test to validate the model
is equivalent to a test of $H_0: V \in \mathcal{V}$ against $H_1$: V is an
arbitrary positive definite matrix.

In order to arrive at a test statistic for testing this
hypothesis we recall that if $V \in \mathcal{V}$ then $V = AUA'$ where A and U
were defined in equations (2.2.4) and (2.2.7). In particular
U was given as the (m+1)x(m+1) diagonal matrix

$$U = \mathrm{diag}(\sigma^2\beta^2, \sigma^2, \sigma^2, \ldots, \sigma^2) .$$   (5.1.1)

We also showed that

$$d_j \sim \sigma^2\chi^2_{v-j}(0) : 1 \le j \le m ,$$   (5.1.2)

and with v large compared to m each of the $d_j$'s should be
nearly the same. Ignoring the first row and column of U we
have that the remaining diagonal elements of U are $\sigma^2$ and
$\{d_j : 1 \le j \le m\}$ are independent estimators of this quantity. If
$H_0$ is true then all of the $d_j$'s should be equal. Another
way of putting this is that the arithmetic mean of $d_1, d_2,
\ldots, d_m$ is equal to the geometric mean, that is,

$$\lambda_1 = \frac{\prod_{i=1}^m d_i}{\left(\frac1m\sum_{j=1}^m d_j\right)^m} = \prod_{i=1}^m \left[\frac{d_i}{\frac1m\sum_{j=1}^m d_j}\right] = 1 , \quad (V \in \mathcal{V}) .$$   (5.1.3)


If $H_0$ is false then the $d_j$'s will not be equal, the
arithmetic mean will be larger than the geometric mean, and
$\lambda_1$ will be less than one. Hence we reject $H_0$ for
small values of $\lambda_1$. The asymptotic distribution of $\lambda_1$ will
be investigated in section 2. This test has an interesting
geometrical interpretation. If we consider the
$\{d_j : 1 \le j \le m\}$ to be the squared lengths of the principal axes of
an m dimensional ellipsoid, the above hypothesis specifies
that these are all equal, that is, that the ellipsoid is a
sphere. Hence this test is the sphericity test on the
$\{d_j : 1 \le j \le m\}$.
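The statistic in (5.1.3) is simple to compute. The following is a minimal sketch, assuming the $d_j$ are supplied as a list of positive numbers; the function name `sphericity_lambda1` is illustrative and does not come from the original program.

```python
import math

def sphericity_lambda1(d):
    """lambda_1 of (5.1.3): product of d_i divided by (arithmetic mean)^m."""
    m = len(d)
    if m == 0 or any(x <= 0 for x in d):
        raise ValueError("d must be a non-empty list of positive values")
    mean = sum(d) / m
    # Work on the log scale to avoid overflow in the product.
    log_lam = sum(math.log(x / mean) for x in d)
    return math.exp(log_lam)
```

Equal $d_j$ give a value of one, and any inequality among them pulls the value below one, which is why small values lead to rejection.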

Besides the sphericity test we consider another test,
independent of $\lambda_1$, based on $\{g_{ij} : 0 \le j < i \le m\}$. The statistics
$\{g_{ij} : 0 \le j < i \le m\}$ contain a great deal of information about the
process since they are used as estimators of $\{\alpha_j : 1 \le j \le m\}$ and
hence of V. In section 3 we shall investigate the distribution
of a function of these statistics under the hypothesis that
$V \in \mathcal{V}$. In section 4 we discuss combining the two tests given
in sections 2 and 3 and the asymptotic equivalence of the
combined tests as compared to the likelihood ratio.


5.2   An Approximation to the Distribution of $-\rho\log\lambda_1$

Referring to equation (5.1.3) we see that $\lambda_1$ may be
written as the product of $u_1, u_2, \ldots, u_m$ where

$$u_i = \frac{d_i}{\frac1m\sum_{j=1}^m d_j} : 1 \le i \le m ,$$   (5.2.1)

that is,

$$\lambda_1 = \prod_{i=1}^m u_i .$$   (5.2.2)

Rather than consider the distribution of $\lambda_1$ we shall
consider the distribution of

$$\eta = -\rho\log\lambda_1 : 0 \le \eta < \infty ,$$   (5.2.3)

where $\rho$ is some constant. The moment generating function
of $\eta$ is

$$\phi_\eta(\theta) = \mathcal{E}\, e^{\theta\eta} = \mathcal{E}(\lambda_1)^{-\theta\rho} = \mathcal{E}\Big(\prod_{i=1}^m u_i\Big)^{-\theta\rho} .$$   (5.2.4)


In order to find this expectation we need to find the joint
distribution of $(u_1, u_2, \ldots, u_m)$. To find the joint distribution
we shall transform from $(d_1, d_2, \ldots, d_m)$ into $(u_1, u_2, \ldots, u_m, S)$
where $u_i$ is defined by (5.2.1) and $S = \frac1m\sum_{j=1}^m d_j$. Hence
we seek the Jacobian of the transformation from $(d_1, d_2, \ldots, d_m)$
to $(u_1, u_2, \ldots, u_m, S)$ defined by

$$d_i = u_i S : 1 \le i \le m ,$$   (5.2.5)

with

$$\sum_{i=1}^m u_i = m .$$   (5.2.6)

Let $[\delta u_i]$ and $[\delta S]$ denote small changes in $u_i$ and S,
respectively. Suppose that the changes $[\delta u_i]$ in $u_i$ and
$[\delta S]$ in S bring about a change $[\delta d_i]$ in $d_i$ so that (5.2.5)
and (5.2.6) are preserved. That is

$$d_i + [\delta d_i] = (u_i + [\delta u_i])(S + [\delta S]) : 1 \le i \le m ,$$   (5.2.7)

and

$$\sum_{i=1}^m (u_i + [\delta u_i]) = m .$$   (5.2.8)

Expanding the above equations and dropping terms of second
order in the $[\delta *]$, $* \in (u_i, S)$, we find that

$$d_i + [\delta d_i] = u_i S + [\delta u_i] S + u_i [\delta S] : 1 \le i \le m ,$$   (5.2.9)

and

$$\sum_{i=1}^m u_i + \sum_{i=1}^m [\delta u_i] = m .$$   (5.2.10)

Since $d_i = u_i S$ and $\sum_{i=1}^m u_i = m$ we see that

$$[\delta d_i] = [\delta u_i] S + u_i [\delta S] : 1 \le i \le m ,$$   (5.2.11)

and

$$\sum_{i=1}^m [\delta u_i] = 0 .$$   (5.2.12)
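To write the above in vector notation we define the (mx1)
column vectors

$$[\delta\mathbf{d}] = ([\delta d_1], [\delta d_2], \ldots, [\delta d_m])' ,$$   (5.2.13)

$$\mathbf{d} = (d_1, d_2, \ldots, d_m)' ,$$   (5.2.14)

$$[\delta\mathbf{u}] = ([\delta u_1], [\delta u_2], \ldots, [\delta u_m])' ,$$   (5.2.15)

$$\mathbf{u} = (u_1, u_2, \ldots, u_m)' ,$$   (5.2.16)

and

$$\mathbf{1} = (1, 1, \ldots, 1)' .$$   (5.2.17)

Equations (5.2.11) and (5.2.12) may now be written

$$[\delta\mathbf{d}] = [\delta\mathbf{u}] S + \mathbf{u}[\delta S]$$   (5.2.18)

and

$$\mathbf{1}'[\delta\mathbf{u}] = 0 .$$   (5.2.19)

Equations (5.2.11) can be thought of as a singular
transformation from $\{[\delta d_1], [\delta d_2], \ldots, [\delta d_m]\}$ to $\{[\delta u_1], [\delta u_2],
\ldots, [\delta u_m], [\delta S]\}$ made one-to-one through use of equation
(5.2.12). Saw [17] has shown that

$$J\{\mathbf{d} \to \mathbf{u}, S\} = J\{[\delta\mathbf{d}] \to [\delta\mathbf{u}], [\delta S]\} ,$$   (5.2.20)

where $J\{[\delta\mathbf{d}] \to [\delta\mathbf{u}], [\delta S]\}$ is the Jacobian of the transformation
defined by (5.2.18), in which $\mathbf{u}$ and S are considered fixed.
Choose P to be an orthogonal mxm matrix with the first
row equal to $\mathbf{1}'$; pre-multiplying equation (5.2.18) by it
gives

$$\mathbf{v} = P[\delta\mathbf{d}] = P[\delta\mathbf{u}] S + P\mathbf{u}[\delta S] = \mathbf{y} S + P\mathbf{u}[\delta S] ,$$   (5.2.21)

where $\mathbf{y} = P[\delta\mathbf{u}] = (y_1, y_2, \ldots, y_m)'$ and $y_1 = 0$ since $\mathbf{1}'[\delta\mathbf{u}] = 0$.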

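From equations (5.2.20) and (5.2.21) we have

$$J\{\mathbf{d} \to \mathbf{u}, S\} = J\{[\delta\mathbf{d}] \to [\delta\mathbf{u}], [\delta S]\} = J\{[\delta\mathbf{d}] \to \mathbf{v}\} \cdot J\{\mathbf{v} \to y_2, \ldots, y_m, [\delta S]\} .$$   (5.2.22)

The Jacobian, $J\{[\delta\mathbf{d}] \to \mathbf{v}\}$, is unity since P is an orthogonal
matrix. The Jacobian, $J\{\mathbf{v} \to y_2, \ldots, y_m, [\delta S]\}$, is the modulus
of the determinant of the (mxm) matrix K with elements

$$(K)_{11} = \frac{\partial v_1}{\partial[\delta S]} = 1 ,$$

$$(K)_{1j} = \frac{\partial v_j}{\partial[\delta S]} = 0 : 2 \le j \le m ,$$

$$(K)_{jj} = \frac{\partial v_j}{\partial y_j} = S : 2 \le j \le m ,$$

$$(K)_{kj} = \frac{\partial v_j}{\partial y_k} = 0 : \text{elsewhere} .$$   (5.2.23)

Hence K is a diagonal matrix and

$$J\{\mathbf{v} \to y_2, \ldots, y_m, [\delta S]\} = \|K\| = S^{m-1}$$   (5.2.24)

and finally

$$J\{\mathbf{d} \to \mathbf{u}, S\} = S^{m-1} .$$   (5.2.25)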

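Since the $d_j$'s are independent with

$$d_j \sim \sigma^2\chi^2_{v-j}(0) : 1 \le j \le m ,$$   (5.2.26)

then

$$f(d_1, d_2, \ldots, d_m) = \prod_{j=1}^m \frac{d_j^{\frac12 v_j - 1}\, e^{-\frac{d_j}{2\sigma^2}}}{(2\sigma^2)^{\frac12 v_j}\,\Gamma(\tfrac12 v_j)} : d_j > 0,\ 1 \le j \le m ,$$   (5.2.27)

where $v_j = v - j : 1 \le j \le m$. Hence we find the joint distribution of
$\mathbf{u} = (u_1, u_2, \ldots, u_m)'$ and S is

$$f(u_1, u_2, \ldots, u_m, S) = \left\{\Gamma\Big(\tfrac12\sum_{j=1}^m v_j\Big) \prod_{j=1}^m \frac{u_j^{\frac12 v_j - 1}}{m^{\frac12 v_j}\,\Gamma(\tfrac12 v_j)}\right\} \left\{\frac{S^{\frac12\sum_{j=1}^m v_j - 1}\, e^{-\frac{mS}{2\sigma^2}}}{\left(\frac{2\sigma^2}{m}\right)^{\frac12\sum_{j=1}^m v_j}\,\Gamma\big(\tfrac12\sum_{j=1}^m v_j\big)}\right\} : 0 \le u_i \le m,\ \sum_{j=1}^m u_j = m;\ S > 0 .$$   (5.2.28)

Hence we find $\mathbf{u}$ and S are independently distributed with

$$S \sim \frac{\sigma^2}{m}\,\chi^2_{\left(\sum_{j=1}^m v_j\right)}(0) ,$$   (5.2.29)

and $\mathbf{u}$ is distributed as $m\mathbf{Z}$ where $\mathbf{Z}$ has the Dirichlet
distribution with (mx1) dimensional parameter vector $(\tfrac12 v_1, \tfrac12 v_2, \ldots,
\tfrac12 v_m)' = (\tfrac12(v-1), \tfrac12(v-2), \ldots, \tfrac12(v-m))'$.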

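If $(y_1, y_2, \ldots, y_n)$ has the Dirichlet distribution with
parameters $(a_1, a_2, \ldots, a_n)$, then the moments about the
origin are given by

$$\mathcal{E}\big(y_1^{r_1} y_2^{r_2} \cdots y_n^{r_n}\big) = \frac{\Big\{\prod_{j=1}^n \Gamma(a_j + r_j)\Big\}\,\Gamma\Big(\sum_{j=1}^n a_j\Big)}{\Gamma\Big(\sum_{j=1}^n (a_j + r_j)\Big)\Big\{\prod_{j=1}^n \Gamma(a_j)\Big\}} .$$   (5.2.30)

Hence we find the moment generating function of $\eta$ is given
by

$$\phi_\eta(\theta) = \mathcal{E}\, e^{\theta\eta} = \mathcal{E}\Big(\prod_{i=1}^m u_i\Big)^{-\theta\rho} ,$$

and letting $u_i = mZ_i$ gives

$$\phi_\eta(\theta) = \mathcal{E}\Big(\prod_{i=1}^m mZ_i\Big)^{-\theta\rho} = m^{-m\theta\rho}\,\mathcal{E}\Big(\prod_{i=1}^m Z_i^{-\theta\rho}\Big) ,$$

and since $Z_i : 1 \le i \le m$ are Dirichlet we have from (5.2.30)
that

$$\phi_\eta(\theta) = m^{-m\theta\rho}\,\frac{\Big\{\prod_{j=1}^m \Gamma(\tfrac12 v_j - \theta\rho)\Big\}\,\Gamma\Big(\tfrac12\sum_{j=1}^m v_j\Big)}{\Gamma\Big(\sum_{j=1}^m (\tfrac12 v_j - \theta\rho)\Big)\Big\{\prod_{j=1}^m \Gamma(\tfrac12 v_j)\Big\}} .$$   (5.2.31)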

Since the moments of $\eta$ are functions of gamma functions
we can apply Box's [4] method to obtain an approximation to
the distribution of $\eta$. A good discussion of Box's method is
also contained in Anderson [1].
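Because (5.2.31) involves only gamma functions, it can be evaluated directly on the log scale. The sketch below assumes the form $\phi_\eta(\theta) = m^{-m\theta\rho}\,\{\prod_j \Gamma(\tfrac12 v_j - \theta\rho)\}\,\Gamma(\tfrac12\sum_j v_j) / [\Gamma(\sum_j(\tfrac12 v_j - \theta\rho))\,\{\prod_j\Gamma(\tfrac12 v_j)\}]$ with $v_j = v - j$, where $\rho$ is the constant written p in the scan; the function name is hypothetical.

```python
import math

def log_mgf_eta(theta, rho, v, m):
    """log of the moment generating function of eta = -rho*log(lambda_1), per (5.2.31)."""
    half_v = [0.5 * (v - j) for j in range(1, m + 1)]
    s = sum(half_v)
    t = theta * rho
    if min(half_v) - t <= 0:
        raise ValueError("theta*rho must be smaller than (v - m)/2")
    out = -m * t * math.log(m)
    out += sum(math.lgamma(a - t) - math.lgamma(a) for a in half_v)
    out += math.lgamma(s) - math.lgamma(s - m * t)
    return out
```

Two quick checks follow from the definitions: the value is zero at $\theta = 0$, and for m = 1 it is zero for every admissible $\theta$, since then $\lambda_1 = 1$ identically.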

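Using equation (5.2.31) the cumulant generating function
for $\eta$ is

$$\Psi_\eta(\theta) = \log\phi_\eta(\theta) = k - m\theta\rho\log m + \sum_{j=1}^m \log\Gamma(\tfrac12(v-j) - \theta\rho) - \log\Gamma(\tfrac12(mv - \tfrac12 m(m+1)) - m\theta\rho) ,$$   (5.2.32)

where k has a value independent of $\theta$. Rewriting this as

$$\Psi_\eta(\theta) = k - m\rho\theta\log m + \sum_{j=1}^m \log\Gamma(a_j + \tfrac12\rho(1-2\theta)) - \log\Gamma(\beta + \tfrac12 m\rho(1-2\theta))$$   (5.2.33)

where

$$a_j = \tfrac12(v - \rho - j) : 1 \le j \le m , \qquad \beta = \tfrac12[mv - m\rho - \tfrac12 m(m+1)] .$$   (5.2.34)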



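We use the expansion formula

$$\log\Gamma(x+h) = \tfrac12\log 2\pi + (x + h - \tfrac12)\log x - x - \sum_{r=1}^\infty \frac{(-1)^r B_{r+1}(h)}{r(r+1)x^r}$$   (5.2.35)

where $B_s(h)$ is the s-th Bernoulli polynomial defined by

$$\frac{\tau e^{h\tau}}{e^\tau - 1} = \sum_{s=0}^\infty \frac{\tau^s}{s!} B_s(h) ,$$   (5.2.36)

for example

$$B_1(h) = h - \tfrac12 ; \quad B_2(h) = h^2 - h + \tfrac16 ; \quad B_3(h) = h^3 - \tfrac32 h^2 + \tfrac12 h .$$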
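The expansion (5.2.35) can be checked numerically against a library log-gamma. The sketch below truncates the sum after r = 2, using only the Bernoulli polynomials $B_2$ and $B_3$ listed above; the truncation point and function names are illustrative choices.

```python
import math

def bernoulli_poly(s, h):
    """Bernoulli polynomials B_1, B_2, B_3 as listed after (5.2.36)."""
    if s == 1:
        return h - 0.5
    if s == 2:
        return h * h - h + 1.0 / 6.0
    if s == 3:
        return h ** 3 - 1.5 * h * h + 0.5 * h
    raise ValueError("only s = 1, 2, 3 are implemented in this sketch")

def log_gamma_expansion(x, h, terms=2):
    """Right-hand side of (5.2.35), truncated after `terms` terms of the sum."""
    total = 0.5 * math.log(2.0 * math.pi) + (x + h - 0.5) * math.log(x) - x
    for r in range(1, terms + 1):
        total -= (-1) ** r * bernoulli_poly(r + 1, h) / (r * (r + 1) * x ** r)
    return total
```

For moderately large x the truncated series agrees closely with `math.lgamma(x + h)`, which is the property Box's method relies on.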

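Using the expansion formula (5.2.35) on (5.2.33) we find
that

$$\Psi_\eta(\theta) = k + \frac{(m-1)}{2}\Big(\log 2\pi - \log\frac{\rho}{2}\Big) - \Big(\beta + \frac{m\rho}{2} - \frac12\Big)\log m - \tfrac12(m-1)\log(1-2\theta) + \sum_{r=1}^\infty \pi_r^*\left\{\frac{1}{(1-2\theta)^r}\right\}$$   (5.2.37)

with

$$\pi_r^* = \frac{(-1)^r\Big\{\dfrac{B_{r+1}(\beta)}{m^r} - \sum_{j=1}^m B_{r+1}(a_j)\Big\}}{r(r+1)(\tfrac12\rho)^r} .$$   (5.2.38)

By virtue of the fact that $\Psi_\eta(\theta = 0) = 0$ we must have

$$k + \frac{(m-1)}{2}\Big(\log 2\pi - \log\frac{\rho}{2}\Big) - \Big(\beta + \frac{m\rho}{2} - \frac12\Big)\log m = -\sum_{r=1}^\infty \pi_r^*$$   (5.2.39)

so that we may write

$$\Psi_\eta(\theta) = -\tfrac12(m-1)\log(1-2\theta) + \sum_{r=1}^\infty \pi_r^*\left\{\frac{1}{(1-2\theta)^r} - 1\right\} .$$   (5.2.40)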
If a variate x has the chi-square distribution on e degrees of freedom
then its cumulant generating function is

$$\Psi_x(\theta) = -\frac{e}{2}\log(1-2\theta) .$$   (5.2.41)

We see that equation (5.2.37) has the same form with
e = (m-1) degrees of freedom and an additional sum which may
be called the remainder. This remainder may be reduced by
choosing $\rho = \rho_0$ so that $\pi_1^* = 0$, and the approximation is improved.
For $\pi_1^* = 0$ we must have

$$B_2(\beta) = m\sum_{j=1}^m B_2(a_j)$$   (5.2.42)

or

$$\beta^2 - \beta + \tfrac16 = m\sum_{j=1}^m \big(a_j^2 - a_j + \tfrac16\big) .$$   (5.2.43)

Recall that

$$\beta = \tfrac12 m(v-\rho) - \tfrac14 m(m+1)$$

and

$$a_j = \tfrac12(v-\rho) - \tfrac12 j : 1 \le j \le m ;$$

letting $\delta = \tfrac12(v-\rho)$ then

$$\beta = m\delta - \tfrac14 m(m+1)$$   (5.2.44)

and

$$a_j = \delta - \tfrac12 j : 1 \le j \le m .$$   (5.2.45)

Substituting (5.2.44) and (5.2.45) into equation (5.2.43)
gives

$$[m\delta - \tfrac14 m(m+1)]^2 - m\delta + \tfrac14 m(m+1) + \tfrac16 = m\sum_{j=1}^m \big[(\delta - \tfrac12 j)^2 - \delta + \tfrac12 j + \tfrac16\big] .$$   (5.2.46)




Expanding the left hand side of equation (5.2.46) one has

$$m^2\delta^2 - \tfrac12 m^2(m+1)\delta + \tfrac1{16} m^2(m+1)^2 - m\delta + \tfrac14 m(m+1) + \tfrac16$$   (5.2.47)

and expanding and summing the right hand side of equation
(5.2.46) one has

$$m^2\delta^2 - \tfrac12 m^2(m+1)\delta + \tfrac1{24} m^2(m+1)(2m+1) - m^2\delta + \tfrac14 m^2(m+1) + \tfrac16 m^2 .$$   (5.2.48)

Collecting like terms we find

$$\delta = \frac{(m+1)(m^2 + 12m + 8)}{48m} ,$$   (5.2.49)

hence

$$\rho_0 = v - \frac{(m+1)(m^2 + 12m + 8)}{24m} .$$   (5.2.50)
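The choice of $\rho_0$ can be verified exactly with rational arithmetic. The sketch below assumes $\rho_0 = v - (m+1)(m^2+12m+8)/(24m)$ as in (5.2.50), together with $\delta = \tfrac12(v-\rho)$, $\beta = m\delta - \tfrac14 m(m+1)$ and $a_j = \delta - \tfrac12 j$, and evaluates the gap $B_2(\beta) - m\sum_j B_2(a_j)$ from (5.2.42); the function names are hypothetical.

```python
from fractions import Fraction

def b2(h):
    """Second Bernoulli polynomial B_2(h) = h^2 - h + 1/6, in exact arithmetic."""
    return h * h - h + Fraction(1, 6)

def rho0(v, m):
    """rho_0 as given in (5.2.50)."""
    return Fraction(v) - Fraction((m + 1) * (m * m + 12 * m + 8), 24 * m)

def first_order_gap(v, m):
    """B_2(beta) - m * sum_j B_2(a_j) evaluated at rho = rho_0; zero means pi_1^* = 0."""
    delta = (Fraction(v) - rho0(v, m)) / 2
    beta = m * delta - Fraction(m * (m + 1), 4)
    a = [delta - Fraction(j, 2) for j in range(1, m + 1)]
    return b2(beta) - m * sum(b2(x) for x in a)
```

A gap of exactly zero for several values of m gives some confidence that the algebra leading to (5.2.49) and (5.2.50) has been transcribed correctly.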

We find then that

$$\phi_\eta(\theta)\Big|_{\rho=\rho_0} = (1-2\theta)^{-\frac12(m-1)}\exp\Big\{\sum_{r=2}^\infty \pi_r^*\big[(1-2\theta)^{-r} - 1\big]\Big\}$$   (5.2.51)


where 


*■        (m+1)  f3m  -36m  -583m  -336m  +160m  -192] 

2  6912p> 

P=Pq  °  (5.2.52) 

Thus the cumulative distribution function of $\eta = -\rho_0\log\lambda_1$ is
found from

$$\Pr\{-\rho_0\log\lambda_1 \le x\} = \Pr\{\chi^2_{(m-1)} \le x\} + \pi_2^*\big(\Pr\{\chi^2_{(m+3)} \le x\} - \Pr\{\chi^2_{(m-1)} \le x\}\big) + R'(\rho_0)$$   (5.2.53)

with $R'(\rho_0)$ a remainder involving terms in $\rho_0^{-3}, \rho_0^{-4}, \ldots$ .

Asymptotically we have that the distribution of

$$-\rho_0\log\lambda_1 = -\rho_0\sum_{i=1}^m \log\left(\frac{d_i}{\frac1m\sum_{j=1}^m d_j}\right)$$   (5.2.54)

tends to that of a chi-square variate with (m-1) degrees of
freedom.
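Putting (5.2.50) and (5.2.54) together, the test statistic can be computed as in the following sketch. It assumes the $d_j$ (j = 1, ..., m) and v are available, and it returns the statistic together with the degrees of freedom of the limiting chi-square; the function names are illustrative, and no p-value is computed here.

```python
import math

def rho0(v, m):
    """rho_0 as given in (5.2.50)."""
    return v - (m + 1) * (m * m + 12 * m + 8) / (24.0 * m)

def sphericity_statistic(d, v):
    """Return (-rho_0 * log(lambda_1), m - 1) for d = [d_1, ..., d_m]."""
    m = len(d)
    if m < 2 or any(x <= 0 for x in d):
        raise ValueError("need at least two positive d_j")
    mean = sum(d) / m
    log_lambda1 = sum(math.log(x / mean) for x in d)
    return -rho0(v, m) * log_lambda1, m - 1
```

The statistic would then be referred to a chi-square table with m - 1 degrees of freedom, rejecting for large values.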
5.3   The Distribution of T, a Function of $\{g_{ij} : 0 \le j < i \le m\}$

In section 5 of Chapter II we derived the distribution
of $\{g_{ij} : 0 \le j < i \le m\}$ conditional on $\{d_j : 0 \le j \le m\}$ when V was
arbitrary and when $V \in \mathcal{V}$. In particular, when $V \in \mathcal{V}$ and m = 4
we find from equation (2.5.15)

$$\begin{aligned}
f(g_{10}, g_{20}, g_{30}, g_{40}, &\, g_{21}, g_{31}, g_{41}, g_{32}, g_{42}, g_{43} \mid d_0, d_1, d_2, d_3) = \\
&\frac{1}{(2\pi\sigma^2 d_0^{-1})^{2}} \exp\Big\{-\frac{d_0}{2\sigma^2}\big[(g_{10}-\alpha_1)^2 + (g_{20}-\alpha_2 g_{10})^2 + (g_{30}-\alpha_3 g_{20})^2 + (g_{40}-\alpha_4 g_{30})^2\big]\Big\} \\
\cdot\ &\frac{1}{(2\pi\sigma^2 d_1^{-1})^{3/2}} \exp\Big\{-\frac{d_1}{2\sigma^2}\big[(g_{21}-\alpha_2)^2 + (g_{31}-\alpha_3 g_{21})^2 + (g_{41}-\alpha_4 g_{31})^2\big]\Big\} \\
\cdot\ &\frac{1}{(2\pi\sigma^2 d_2^{-1})} \exp\Big\{-\frac{d_2}{2\sigma^2}\big[(g_{32}-\alpha_3)^2 + (g_{42}-\alpha_4 g_{32})^2\big]\Big\} \\
\cdot\ &\frac{1}{(2\pi\sigma^2 d_3^{-1})^{1/2}} \exp\Big\{-\frac{d_3}{2\sigma^2}(g_{43}-\alpha_4)^2\Big\} : 0 \le j < i \le 4,\ -\infty < g_{ij} < \infty .
\end{aligned}$$   (5.3.1)

If in (5.3.1) we replace $(\alpha_1, \alpha_2, \alpha_3, \alpha_4)$ by their estimators
$(g_{10}, g_{21}, g_{32}, g_{43})$ we obtain the statistics $\{(g_{20} - g_{21}g_{10}),
(g_{30} - g_{32}g_{20}), (g_{40} - g_{43}g_{30}), (g_{31} - g_{32}g_{21}), (g_{41} - g_{43}g_{31}),
(g_{42} - g_{43}g_{32})\}$. These statistics indicate a departure from
the model. We shall investigate the distribution of a function
of these statistics. No compact expressions have been found
for general m so we present here the case for m = 4. The
case for general m follows directly from the case m = 4. In
what follows we shall use conditional distributions and for
convenience shall let # denote a set of fixed variables.

Consider first the statistics $\{(g_{40} - g_{43}g_{30}), (g_{41} - g_{43}g_{31}),
(g_{42} - g_{43}g_{32})\}$. We shall determine their joint distribution
conditional on # $= \{d_0, d_1, d_2, d_3, d_4, g_{30}, g_{31}, g_{32}\}$. We have
from (5.3.1) that $g_{40}, g_{41}, g_{42}$, and $g_{43}$ are normally
distributed conditional on # so we need to find the moments
of $\{(g_{40} - g_{43}g_{30}), (g_{41} - g_{43}g_{31}), (g_{42} - g_{43}g_{32})\}$ conditional
on # to determine their joint distribution.
We have that

$$\mathcal{E}(g_{40} - g_{43}g_{30}) \mid \# = \mathcal{E}[\mathcal{E}(g_{40} - g_{43}g_{30})] \mid \# ,$$   (5.3.2)

where the inner expectation is with respect to $g_{40}$ and the
outer expectation is with respect to $g_{43}$. From (5.3.1) we
obtain $\mathcal{E}(g_{40} \mid \#)$ by inspection to be $\alpha_4 g_{30}$, hence

$$\mathcal{E}(g_{40} - g_{43}g_{30}) \mid \# = \mathcal{E}[(\alpha_4 - g_{43})g_{30}] \mid \#$$   (5.3.3)

$$= 0$$   (5.3.4)

since by inspection of (5.3.1) $\mathcal{E}(g_{43}) \mid \# = \alpha_4$.
In the same way

$$\mathcal{E}(g_{41} - g_{43}g_{31}) \mid \# = \mathcal{E}[\mathcal{E}(g_{41} - g_{43}g_{31})] \mid \# = \mathcal{E}[(\alpha_4 - g_{43})g_{31}] \mid \# = 0 ,$$   (5.3.5)

and

$$\mathcal{E}(g_{42} - g_{43}g_{32}) \mid \# = \mathcal{E}[\mathcal{E}(g_{42} - g_{43}g_{32})] \mid \# = \mathcal{E}[(\alpha_4 - g_{43})g_{32}] \mid \# = 0 .$$   (5.3.6)




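The second moments are handled identically and we find

$$\begin{aligned}
\mathcal{E}(g_{40} - g_{43}g_{30})^2 \mid \# &= \mathcal{E}[\mathcal{E}(g_{40}^2 - 2g_{40}g_{43}g_{30} + g_{43}^2 g_{30}^2)] \mid \# \\
&= \mathcal{E}\Big[\frac{\sigma^2}{d_0} + \alpha_4^2 g_{30}^2 - 2\alpha_4 g_{43} g_{30}^2 + g_{43}^2 g_{30}^2\Big] \mid \# \\
&= \frac{\sigma^2}{d_0} + \alpha_4^2 g_{30}^2 - 2\alpha_4^2 g_{30}^2 + g_{30}^2\Big(\frac{\sigma^2}{d_3} + \alpha_4^2\Big) \\
&= \frac{\sigma^2}{d_0} + g_{30}^2\frac{\sigma^2}{d_3} ,
\end{aligned}$$   (5.3.7)

$$\begin{aligned}
\mathcal{E}(g_{41} - g_{43}g_{31})^2 \mid \# &= \mathcal{E}[\mathcal{E}(g_{41}^2 - 2g_{41}g_{43}g_{31} + g_{43}^2 g_{31}^2)] \mid \# \\
&= \mathcal{E}\Big[\frac{\sigma^2}{d_1} + \alpha_4^2 g_{31}^2 - 2\alpha_4 g_{43} g_{31}^2 + g_{43}^2 g_{31}^2\Big] \mid \# \\
&= \frac{\sigma^2}{d_1} + \alpha_4^2 g_{31}^2 - 2\alpha_4^2 g_{31}^2 + g_{31}^2\Big(\frac{\sigma^2}{d_3} + \alpha_4^2\Big) \\
&= \frac{\sigma^2}{d_1} + g_{31}^2\frac{\sigma^2}{d_3} ,
\end{aligned}$$   (5.3.8)

and

$$\begin{aligned}
\mathcal{E}(g_{42} - g_{43}g_{32})^2 \mid \# &= \mathcal{E}[\mathcal{E}(g_{42}^2 - 2g_{42}g_{43}g_{32} + g_{43}^2 g_{32}^2)] \mid \# \\
&= \mathcal{E}\Big[\frac{\sigma^2}{d_2} + \alpha_4^2 g_{32}^2 - 2\alpha_4 g_{43} g_{32}^2 + g_{43}^2 g_{32}^2\Big] \mid \# \\
&= \frac{\sigma^2}{d_2} + \alpha_4^2 g_{32}^2 - 2\alpha_4^2 g_{32}^2 + g_{32}^2\Big(\frac{\sigma^2}{d_3} + \alpha_4^2\Big) \\
&= \frac{\sigma^2}{d_2} + g_{32}^2\frac{\sigma^2}{d_3} .
\end{aligned}$$   (5.3.9)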

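The cross product terms are handled similarly and we have that

$$\mathcal{E}(g_{40} - g_{43}g_{30})(g_{41} - g_{43}g_{31}) \mid \# = \mathcal{E}\{\mathcal{E}[\mathcal{E}(g_{40}g_{41} - g_{40}g_{43}g_{31} - g_{41}g_{43}g_{30} + g_{43}^2 g_{30}g_{31})]\} \mid \# ,$$   (5.3.10)

where the inner, middle, and outer expectations are with
respect to $g_{40}$, $g_{41}$, and $g_{43}$, respectively. Continuing we
find

$$\begin{aligned}
\mathcal{E}(g_{40} - g_{43}g_{30})(g_{41} - g_{43}g_{31}) \mid \# &= \mathcal{E}\{\mathcal{E}[\alpha_4 g_{30}g_{41} - \alpha_4 g_{30}g_{43}g_{31} - g_{41}g_{43}g_{30} + g_{43}^2 g_{30}g_{31}]\} \mid \# \\
&= \mathcal{E}\{\alpha_4^2 g_{30}g_{31} - \alpha_4 g_{30}g_{43}g_{31} - \alpha_4 g_{43}g_{30}g_{31} + g_{43}^2 g_{30}g_{31}\} \mid \# \\
&= \alpha_4^2 g_{30}g_{31} - 2\alpha_4^2 g_{30}g_{31} + \Big(\frac{\sigma^2}{d_3} + \alpha_4^2\Big) g_{30}g_{31} \\
&= \frac{\sigma^2}{d_3}\, g_{30}g_{31} ,
\end{aligned}$$   (5.3.11)

$$\begin{aligned}
\mathcal{E}(g_{40} - g_{43}g_{30})(g_{42} - g_{43}g_{32}) \mid \# &= \mathcal{E}\{\mathcal{E}[\mathcal{E}(g_{40}g_{42} - g_{40}g_{43}g_{32} - g_{42}g_{43}g_{30} + g_{43}^2 g_{30}g_{32})]\} \mid \# \\
&= \mathcal{E}\{\mathcal{E}[\alpha_4 g_{30}g_{42} - \alpha_4 g_{30}g_{43}g_{32} - g_{42}g_{43}g_{30} + g_{43}^2 g_{30}g_{32}]\} \mid \# \\
&= \mathcal{E}\{\alpha_4^2 g_{30}g_{32} - \alpha_4 g_{30}g_{43}g_{32} - \alpha_4 g_{43}g_{30}g_{32} + g_{43}^2 g_{30}g_{32}\} \mid \# \\
&= \alpha_4^2 g_{30}g_{32} - \alpha_4^2 g_{30}g_{32} - \alpha_4^2 g_{30}g_{32} + \Big(\frac{\sigma^2}{d_3} + \alpha_4^2\Big) g_{30}g_{32} \\
&= \frac{\sigma^2}{d_3}\, g_{30}g_{32} ,
\end{aligned}$$   (5.3.12)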
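and

$$\begin{aligned}
\mathcal{E}(g_{41} - g_{43}g_{31})(g_{42} - g_{43}g_{32}) \mid \# &= \mathcal{E}\{\mathcal{E}[\mathcal{E}(g_{41}g_{42} - g_{41}g_{43}g_{32} - g_{42}g_{43}g_{31} + g_{43}^2 g_{31}g_{32})]\} \mid \# \\
&= \mathcal{E}\{\mathcal{E}[\alpha_4 g_{31}g_{42} - \alpha_4 g_{31}g_{43}g_{32} - g_{42}g_{43}g_{31} + g_{43}^2 g_{31}g_{32}]\} \mid \# \\
&= \mathcal{E}\{\alpha_4^2 g_{31}g_{32} - \alpha_4 g_{31}g_{43}g_{32} - \alpha_4 g_{32}g_{43}g_{31} + g_{43}^2 g_{31}g_{32}\} \mid \# \\
&= \alpha_4^2 g_{31}g_{32} - \alpha_4^2 g_{31}g_{32} - \alpha_4^2 g_{31}g_{32} + \Big(\frac{\sigma^2}{d_3} + \alpha_4^2\Big) g_{31}g_{32} \\
&= \frac{\sigma^2}{d_3}\, g_{31}g_{32} .
\end{aligned}$$   (5.3.13)

Hence we find

$$\begin{pmatrix} g_{40} - g_{43}g_{30} \\ g_{41} - g_{43}g_{31} \\ g_{42} - g_{43}g_{32} \end{pmatrix} \Bigm| \# \sim N_3(\boldsymbol{\mu}, \sigma^2 V_3)$$   (5.3.14)

where

$$\boldsymbol{\mu} = \mathbf{0}$$   (5.3.15)

and

$$V_3 = \begin{pmatrix}
\dfrac{1}{d_0} + \dfrac{g_{30}^2}{d_3} & g_{30}g_{31}d_3^{-1} & g_{30}g_{32}d_3^{-1} \\
g_{30}g_{31}d_3^{-1} & \dfrac{1}{d_1} + \dfrac{g_{31}^2}{d_3} & g_{31}g_{32}d_3^{-1} \\
g_{30}g_{32}d_3^{-1} & g_{31}g_{32}d_3^{-1} & \dfrac{1}{d_2} + \dfrac{g_{32}^2}{d_3}
\end{pmatrix} .$$   (5.3.16)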


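It is well known that if $\mathbf{x}$: (nx1) $\sim N_n(\mathbf{0}, \Sigma)$ then $\mathbf{x}'\Sigma^{-1}\mathbf{x} \sim \chi^2_n(0)$.
Therefore it follows that

$$r_4 = \begin{pmatrix} g_{40} - g_{43}g_{30} \\ g_{41} - g_{43}g_{31} \\ g_{42} - g_{43}g_{32} \end{pmatrix}' V_3^{-1} \begin{pmatrix} g_{40} - g_{43}g_{30} \\ g_{41} - g_{43}g_{31} \\ g_{42} - g_{43}g_{32} \end{pmatrix} \Bigm| \# \sim \sigma^2\chi^2_3(0) ,$$   (5.3.17)

where the subscript on r equals the first subscript on g.
Since the distribution above is functionally independent of
the variables in # we have the unconditional distribution
also, that is,

$$r_4 \sim \sigma^2\chi^2_3(0) .$$   (5.3.18)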

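Putting # $= \{d_0, d_1, d_2, d_3, d_4, g_{20}, g_{21}\}$, we now find the
joint distribution of $\{(g_{30} - g_{32}g_{20}), (g_{31} - g_{32}g_{21})\}$.
Proceeding as before

$$\mathcal{E}(g_{30} - g_{32}g_{20}) \mid \# = \mathcal{E}[\mathcal{E}(g_{30} - g_{32}g_{20})] \mid \# = \mathcal{E}[(\alpha_3 - g_{32})g_{20}] \mid \# = 0 ,$$   (5.3.19)

and

$$\mathcal{E}(g_{31} - g_{32}g_{21}) \mid \# = \mathcal{E}[\mathcal{E}(g_{31} - g_{32}g_{21})] \mid \# = \mathcal{E}[(\alpha_3 - g_{32})g_{21}] \mid \# = 0 .$$   (5.3.20)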
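The second moments are

$$\begin{aligned}
\mathcal{E}(g_{30} - g_{32}g_{20})^2 \mid \# &= \mathcal{E}[\mathcal{E}(g_{30}^2 - 2g_{30}g_{32}g_{20} + g_{32}^2 g_{20}^2)] \mid \# \\
&= \mathcal{E}\Big[\frac{\sigma^2}{d_0} + \alpha_3^2 g_{20}^2 - 2\alpha_3 g_{32} g_{20}^2 + g_{32}^2 g_{20}^2\Big] \mid \# \\
&= \frac{\sigma^2}{d_0} + \alpha_3^2 g_{20}^2 - 2\alpha_3^2 g_{20}^2 + \Big(\frac{\sigma^2}{d_2} + \alpha_3^2\Big) g_{20}^2 \\
&= \frac{\sigma^2}{d_0} + \frac{\sigma^2}{d_2}\, g_{20}^2 ,
\end{aligned}$$   (5.3.21)

and

$$\begin{aligned}
\mathcal{E}(g_{31} - g_{32}g_{21})^2 \mid \# &= \mathcal{E}[\mathcal{E}(g_{31}^2 - 2g_{31}g_{32}g_{21} + g_{32}^2 g_{21}^2)] \mid \# \\
&= \mathcal{E}\Big[\frac{\sigma^2}{d_1} + \alpha_3^2 g_{21}^2 - 2\alpha_3 g_{32} g_{21}^2 + g_{32}^2 g_{21}^2\Big] \mid \# \\
&= \frac{\sigma^2}{d_1} + \alpha_3^2 g_{21}^2 - 2\alpha_3^2 g_{21}^2 + \Big(\frac{\sigma^2}{d_2} + \alpha_3^2\Big) g_{21}^2 \\
&= \frac{\sigma^2}{d_1} + \frac{\sigma^2}{d_2}\, g_{21}^2 .
\end{aligned}$$   (5.3.22)

The cross product term is

$$\begin{aligned}
\mathcal{E}(g_{30} - g_{32}g_{20})(g_{31} - g_{32}g_{21}) \mid \# &= \mathcal{E}\{\mathcal{E}[\mathcal{E}(g_{30}g_{31} - g_{30}g_{32}g_{21} - g_{31}g_{32}g_{20} + g_{32}^2 g_{20}g_{21})]\} \mid \# \\
&= \mathcal{E}\{\mathcal{E}[\alpha_3 g_{20}g_{31} - \alpha_3 g_{20}g_{32}g_{21} - g_{31}g_{32}g_{20} + g_{32}^2 g_{20}g_{21}]\} \mid \# \\
&= \mathcal{E}\{\alpha_3^2 g_{20}g_{21} - \alpha_3 g_{20}g_{32}g_{21} - \alpha_3 g_{21}g_{32}g_{20} + g_{32}^2 g_{20}g_{21}\} \mid \# \\
&= \alpha_3^2 g_{20}g_{21} - 2\alpha_3^2 g_{20}g_{21} + \Big(\frac{\sigma^2}{d_2} + \alpha_3^2\Big) g_{20}g_{21} \\
&= \frac{\sigma^2}{d_2}\, g_{20}g_{21} .
\end{aligned}$$   (5.3.23)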
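Hence we find that

$$\begin{pmatrix} g_{30} - g_{32}g_{20} \\ g_{31} - g_{32}g_{21} \end{pmatrix} \Bigm| \# \sim N_2(\boldsymbol{\mu}, \sigma^2 V_2)$$   (5.3.24)

where

$$\boldsymbol{\mu} = \mathbf{0}$$   (5.3.25)

and

$$V_2 = \begin{pmatrix}
\dfrac{1}{d_0} + \dfrac{g_{20}^2}{d_2} & g_{20}g_{21}d_2^{-1} \\
g_{20}g_{21}d_2^{-1} & \dfrac{1}{d_1} + \dfrac{g_{21}^2}{d_2}
\end{pmatrix} .$$   (5.3.26)

Now form

$$r_3 = \begin{pmatrix} g_{30} - g_{32}g_{20} \\ g_{31} - g_{32}g_{21} \end{pmatrix}' V_2^{-1} \begin{pmatrix} g_{30} - g_{32}g_{20} \\ g_{31} - g_{32}g_{21} \end{pmatrix}$$   (5.3.27)

where

$$r_3 \mid \# \sim \sigma^2\chi^2_2(0) .$$   (5.3.28)

By the same arguments as before we have that the distribution
of $r_3$ is functionally independent of the elements of # and
hence

$$r_3 \sim \sigma^2\chi^2_2(0)$$   (5.3.29)

unconditionally. Moreover $r_3$ is independent of $r_4$ since the
distribution of $r_4$ is functionally independent of the
elements of $r_3$.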

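Finally we consider $(g_{20} - g_{21}g_{10})$ and put

# $= \{d_0, d_1, d_2, d_3, d_4, g_{10}\}$ .

Then

$$\mathcal{E}(g_{20} - g_{21}g_{10}) \mid \# = \mathcal{E}[\mathcal{E}(g_{20} - g_{21}g_{10})] \mid \# = \mathcal{E}[(\alpha_2 - g_{21})g_{10}] \mid \# = 0 ,$$   (5.3.30)

and

$$\begin{aligned}
\mathcal{E}(g_{20} - g_{21}g_{10})^2 \mid \# &= \mathcal{E}[\mathcal{E}(g_{20}^2 - 2g_{20}g_{21}g_{10} + g_{21}^2 g_{10}^2)] \mid \# \\
&= \mathcal{E}\Big[\frac{\sigma^2}{d_0} + \alpha_2^2 g_{10}^2 - 2\alpha_2 g_{21} g_{10}^2 + g_{21}^2 g_{10}^2\Big] \mid \# \\
&= \frac{\sigma^2}{d_0} + \alpha_2^2 g_{10}^2 - 2\alpha_2^2 g_{10}^2 + \Big(\frac{\sigma^2}{d_1} + \alpha_2^2\Big) g_{10}^2 \\
&= \frac{\sigma^2}{d_0} + \frac{\sigma^2}{d_1}\, g_{10}^2 .
\end{aligned}$$   (5.3.31)

Hence

$$(g_{20} - g_{21}g_{10}) \mid \# \sim N(\mu, \sigma^2 V_1)$$   (5.3.32)

where

$$\mu = 0$$   (5.3.33)

and

$$V_1 = \frac{1}{d_0} + \frac{g_{10}^2}{d_1} .$$   (5.3.34)

With

$$r_2 = \frac{(g_{20} - g_{21}g_{10})^2}{\Big(\dfrac{1}{d_0} + \dfrac{g_{10}^2}{d_1}\Big)}$$   (5.3.35)

then

$$r_2 \mid \# \sim \sigma^2\chi^2_1(0) .$$   (5.3.36)

Since the distribution of $r_2$ is functionally independent of
the elements of # we have, unconditionally, that $r_2$ is distributed as
$\sigma^2$ times a chi-square with one degree of freedom. Also $r_2$ is independent of
$r_3$ and $r_4$ since their distribution is functionally independent
of the elements of $r_2$. Since the three statistics are
independent we may add them to get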
$$R = (r_2 + r_3 + r_4) \sim \sigma^2\chi^2_6(0) .$$   (5.3.37)

We note that the distribution of R depends on the nuisance
parameter $\sigma^2$; to eliminate this parameter we consider

$$T = \frac{R/6}{\sigma^{*2}} \sim F_{6,\,4v-10}(0) .$$   (5.3.38)

Since R is independent of $\{d_0, d_1, d_2, d_3, d_4\}$ it is
independent of $\sigma^{*2}$, and T is the ratio of two independent
chi-square variates divided by their degrees of freedom, that is,
T has the F distribution.

Before extending this result to general m we note that
the dispersion matrix $V_3$ of equation (5.3.16) may be written

$$V_3 = D^{-1}(3) + \boldsymbol{\gamma}_3\boldsymbol{\gamma}_3'$$   (5.3.39)

where

$$D(3) = \mathrm{diag}(d_0, d_1, d_2)$$   (5.3.40)

and

$$\boldsymbol{\gamma}_3 = d_3^{-1/2}\,(g_{30}, g_{31}, g_{32})' .$$   (5.3.41)

Since $V_3$ may be written in this form its inverse can be
obtained from the Binomial Inverse Theorem, found in Press
[13], which states

$$\big(D^{-1}(3) + \boldsymbol{\gamma}_3\boldsymbol{\gamma}_3'\big)^{-1} = D(3) - \frac{D(3)\boldsymbol{\gamma}_3\boldsymbol{\gamma}_3' D(3)}{1 + \boldsymbol{\gamma}_3' D(3)\boldsymbol{\gamma}_3} .$$   (5.3.42)

This is very useful in the actual computing of the statistic
T. It follows that $V_2$ is of the same form and can be written

$$V_2 = D^{-1}(2) + \boldsymbol{\gamma}_2\boldsymbol{\gamma}_2'$$   (5.3.43)

where

$$D(2) = \mathrm{diag}(d_0, d_1)$$   (5.3.44)

and

$$\boldsymbol{\gamma}_2 = d_2^{-1/2}\,(g_{20}, g_{21})' .$$   (5.3.45)
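The rank-one inverse in (5.3.42) avoids a general matrix inversion. The sketch below builds the inverse of $D^{-1} + \boldsymbol{\gamma}\boldsymbol{\gamma}'$ for a diagonal D directly from that formula, using plain Python lists; the function name and the list-of-lists layout are assumptions made for illustration.

```python
def binomial_inverse(d, gamma):
    """Inverse of (D^{-1} + gamma gamma') with D = diag(d), following (5.3.42)."""
    n = len(d)
    if len(gamma) != n:
        raise ValueError("d and gamma must have the same length")
    dg = [d[i] * gamma[i] for i in range(n)]              # D gamma
    denom = 1.0 + sum(gamma[i] * dg[i] for i in range(n))  # 1 + gamma' D gamma
    return [
        [(d[i] if i == j else 0.0) - dg[i] * dg[j] / denom for j in range(n)]
        for i in range(n)
    ]
```

Multiplying the result by $D^{-1} + \boldsymbol{\gamma}\boldsymbol{\gamma}'$ should recover the identity matrix, which is a convenient check before using it in a quadratic form.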

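In general we can form

$$r_j = \boldsymbol{\xi}_j' V_{j-1}^{-1} \boldsymbol{\xi}_j : 2 \le j \le m ,$$   (5.3.46)

where

$$\boldsymbol{\xi}_j = \big[(g_{j0} - g_{j-1,0}\,g_{j,j-1}),\ (g_{j1} - g_{j-1,1}\,g_{j,j-1}),\ \ldots,\ (g_{j,j-2} - g_{j-1,j-2}\,g_{j,j-1})\big]' : 2 \le j \le m ,$$   (5.3.47)

and

$$V_{j-1} = D^{-1}(j-1) + \boldsymbol{\gamma}_{j-1}\boldsymbol{\gamma}_{j-1}' : 2 \le j \le m ,$$   (5.3.48)

with

$$D(j-1) = \mathrm{diag}(d_0, d_1, \ldots, d_{j-2}) : 2 \le j \le m ,$$   (5.3.49)

and

$$\boldsymbol{\gamma}_{j-1} = d_{j-1}^{-1/2}\,(g_{j-1,0}, g_{j-1,1}, \ldots, g_{j-1,j-2})' : 2 \le j \le m .$$   (5.3.50)

Following the pattern given for m = 4, when $V \in \mathcal{V}$

$$r_j \sim \sigma^2\chi^2_{j-1}(0) : 2 \le j \le m$$   (5.3.51)

and they are mutually independent so that

$$R = \sum_{j=2}^m r_j \sim \sigma^2\chi^2_{\frac12 m(m-1)}(0) , \quad (V \in \mathcal{V})$$   (5.3.52)

and finally

$$T = \frac{R / \tfrac12 m(m-1)}{\sigma^{*2}} \sim F_{\frac12 m(m-1),\ mv - \frac12 m(m+1)}(0) , \quad (V \in \mathcal{V}) .$$   (5.3.53)

No attempt has been made to find the distribution of T when
$V \notin \mathcal{V}$, but a computer simulation indicates, as we would expect,
that T is stochastically larger when $V \notin \mathcal{V}$ than when $V \in \mathcal{V}$.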
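The computation of R and T for general m can be sketched as follows. The code assumes `g[i][j]` holds $g_{ij}$ for $0 \le j < i \le m$ and `d[j]` holds $d_j$ for $0 \le j \le m$, and it uses the rank-one inverse (5.3.42) so that no matrix is inverted explicitly. The denominator of T is taken here to be $\sum_{j=1}^m d_j$ divided by its degrees of freedom $mv - \tfrac12 m(m+1)$; that choice is an assumption consistent with the chi-square distributions of the $d_j$, not a formula quoted from the text, and all names are hypothetical.

```python
def r_statistic(j, g, d):
    """r_j of (5.3.46) for 2 <= j <= m, via the rank-one inverse (5.3.42)."""
    a = g[j][j - 1]                                   # estimator of alpha_j
    xi = [g[j][k] - g[j - 1][k] * a for k in range(j - 1)]
    gam = [g[j - 1][k] / d[j - 1] ** 0.5 for k in range(j - 1)]
    w = d[: j - 1]                                    # diagonal of D(j-1)
    q_xx = sum(w[k] * xi[k] ** 2 for k in range(j - 1))
    q_xg = sum(w[k] * xi[k] * gam[k] for k in range(j - 1))
    q_gg = sum(w[k] * gam[k] ** 2 for k in range(j - 1))
    return q_xx - q_xg ** 2 / (1.0 + q_gg)

def adequacy_T(g, d, v):
    """Return (T, numerator df, denominator df) under the assumed variance estimate."""
    m = len(d) - 1
    R = sum(r_statistic(j, g, d) for j in range(2, m + 1))
    num_df = m * (m - 1) / 2
    den_df = m * v - m * (m + 1) / 2
    scale = sum(d[1:]) / den_df                       # assumed estimator of sigma^2
    return (R / num_df) / scale, num_df, den_df
```

For m = 2 the function `r_statistic` reduces to the closed form $(g_{20} - g_{21}g_{10})^2 / (1/d_0 + g_{10}^2/d_1)$, which gives a simple hand check.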
5.4   Asymptotic Performance of $(-\rho_0\log\lambda_1)$ and T

In section 2 we derived the asymptotic distribution of
$-\rho_0\log\lambda_1$ and showed it to have a chi-square distribution with
(m-1) degrees of freedom. In that section we also showed
that the distribution of $-\rho_0\log\lambda_1$ is independent of
$S = \frac1m\sum_{j=1}^m d_j$ or equivalently of $\sum_{j=1}^m d_j$. In section 3 we showed
that R has a chi-square distribution with $\tfrac12 m(m-1)$ degrees of
freedom, independent of $\{d_0, d_1, \ldots, d_m\}$ and hence independent
of $-\rho_0\log\lambda_1$ and $\sum_{j=1}^m d_j$. Since R is independent of $\sum_{j=1}^m d_j$ it
is independent of $\sigma^{*2}$ and hence we formed T equal to the ratio
of R and $\sigma^{*2}$ divided by the appropriate constants to form an
F distribution with $\tfrac12 m(m-1)$ degrees of freedom in the numerator
and $[mv - \tfrac12 m(m+1)]$ degrees of freedom in the denominator. Since
both R and $\sigma^{*2}$ are independent of $-\rho_0\log\lambda_1$ then so is T. Now
the distribution of $\tfrac12 m(m-1)T$ tends to that of a chi-square
variate with $\tfrac12 m(m-1)$ degrees of freedom as $v \to \infty$. Since T and
$-\rho_0\log\lambda_1$ are independent we have

$$\lim_{v\to\infty}\{-\rho_0\log\lambda_1 + \tfrac12 m(m-1)T\} \sim \chi^2_{\frac12(m-1)(m+2)}(0) .$$   (5.4.1)

It has been shown by Wilks [18] that under certain
regularity conditions $-2\log\lambda$ will be asymptotically distributed
as a chi-square with $\ell$ degrees of freedom under the null
hypothesis, where $\lambda$ denotes the likelihood ratio. The degrees of
freedom, $\ell$, may be computed from $(\ell_1 - \ell_0)$ where $\ell_1$ equals the
number of parameters estimated under the alternative hypothesis
($H_1$) and $\ell_0$ equals the number of parameters estimated under
the null hypothesis ($H_0$). For the problem here we find that
under $H_1$ V is arbitrary and we must estimate all $\tfrac12(m+1)(m+2) = \ell_1$
different parameters. Under $H_0$ there are only $(m+2) = \ell_0$
unknown parameters to estimate and hence

$$\ell = \ell_1 - \ell_0 = \tfrac12(m+2)(m-1) .$$   (5.4.2)

That is, the asymptotic distributions of $-2\log\lambda$ and
$(-\rho_0\log\lambda_1 + \tfrac12 m(m-1)T)$ agree under the null hypothesis.
Hence both methods are asymptotically equivalent under the
null hypothesis.
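The agreement of the degrees of freedom can be confirmed by simple arithmetic. The short sketch below compares the sum of the two component degrees of freedom, $(m-1) + \tfrac12 m(m-1)$, with the Wilks count $\ell_1 - \ell_0$ from (5.4.2); the function names are illustrative.

```python
def combined_df(m):
    """Degrees of freedom of -rho_0*log(lambda_1) plus those of (1/2)m(m-1)T."""
    sphericity_df = m - 1
    t_df = m * (m - 1) // 2
    return sphericity_df + t_df

def wilks_df(m):
    """l_1 - l_0 with l_1 = (m+1)(m+2)/2 and l_0 = m + 2, as in (5.4.2)."""
    l1 = (m + 1) * (m + 2) // 2
    l0 = m + 2
    return l1 - l0
```

For example, m = 4 gives 3 + 6 = 9 from the combined statistic and 15 - 6 = 9 from the parameter count.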

We note that since $-\rho_0\log\lambda_1$ and T are independent,
Fisher's method of combining independent tests may be used
in place of $(-\rho_0\log\lambda_1 + \tfrac12 m(m-1)T)$. Fisher's method would be
especially appropriate if the sample size is small.


Chapter  VI 
COMPUTER  SIMULATIONS  AND  AN  APPLICATION 

6.1   Introduction

A computer simulation of the generalized autoregressive
process was performed thirty times. Each simulation had
fifty vector observations with each vector observation having
six measures including the initial measure. Specific values
were given to $(\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5)$, $\sigma^2$, and $\beta^2$; they were
(0.80, 0.60, 0.50, 0.30, 0.20), 1.00, and 4.00, respectively.
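A simulation of this kind might be sketched as follows. The sketch assumes a zero-drift process with $y_0 \sim N(0, \sigma^2\beta^2)$ and $y_j = \alpha_j y_{j-1} + e_j$, $e_j \sim N(0, \sigma^2)$, which is consistent with the dispersion structure $U = \mathrm{diag}(\sigma^2\beta^2, \sigma^2, \ldots, \sigma^2)$ but is not taken from the original IBM 360 program; the function names, the seed handling, and the use of Python's `random` module are all illustrative.

```python
import random

def simulate_replicate(alphas, sigma2, beta2, rng):
    """One vector observation: initial measure plus one measure per alpha_j."""
    y = [rng.gauss(0.0, (sigma2 * beta2) ** 0.5)]     # assumed initial distribution
    for a in alphas:
        y.append(a * y[-1] + rng.gauss(0.0, sigma2 ** 0.5))
    return y

def simulate_run(alphas, sigma2, beta2, n_vectors=50, seed=0):
    """One simulation run of n_vectors independent vector observations."""
    rng = random.Random(seed)
    return [simulate_replicate(alphas, sigma2, beta2, rng) for _ in range(n_vectors)]
```

With the parameter values quoted above, `simulate_run([0.80, 0.60, 0.50, 0.30, 0.20], 1.00, 4.00)` yields fifty vectors of six measures each; thirty such runs with different seeds would mirror the design described.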

The  simulations  were  made  using  a  computer  program 
written  for  the  IBM  360  computer.   The  output  from  the  program 
includes 

(1)  the  data  used  in  the  analysis 

(2)  the  mean  for  each  time  period 

(3)  the  cross  product  matrix 

(4)  the  G  matrix 

(5)  the  diagonal  elements  of  D 

(6)  the starred estimates of $\{\alpha_i : 1 \le i \le m\}$, $\sigma^2$, and $\beta^2$

(7)  the maximum likelihood estimates of $\{\alpha_i : 1 \le i \le m\}$, $\sigma^2$, and $\beta^2$

(8)  the values of $-\rho_0\log\lambda_1$ and T used in testing the
adequacy of the model.

The main purpose of the simulations was to see if the
starred estimators would perform well. In keeping with this
we present only the starred and maximum likelihood estimates
for $\{\alpha_i : 1 \le i \le m\}$, $\sigma^2$, and $\beta^2$.

An application of the theory was made using data from
a drug study at the University of Florida. This study was
directed by Dr. Arlan L. Rosenbloom. Each patient was
infused with glucose and observations were taken on the
patient's level of calcium prior to infusion and at 90 minute
intervals thereafter for four additional observations.

6 . 2   Computer  Simulation  Results 

Each of the estimates was tested against its true value
at the .05 level of significance. On the average, then, we
would expect to reject one or two of the thirty estimates
(30 x .05 = 1.5) by chance alone. Those that were significantly
different from the actual value are marked with an asterisk.
Counting the number of tests that were accepted as a measure of
the estimator's goodness, we find that $\alpha^*_1$ gave 28
acceptable estimates out of 30. Since $\alpha^*_1$ is identical
to the maximum likelihood estimator, $\hat{\alpha}_1$, there is
no comparison. $\alpha^*_2$ gave acceptable estimates in all 30
runs while $\hat{\alpha}_2$ gave 28. In estimating
$\alpha_3 = .50$, the starred estimator did slightly better, with
$\alpha^*_3$ giving 28 acceptable estimates and $\hat{\alpha}_3$
giving 27. $\alpha^*_4$ gave acceptable estimates in all runs
while $\hat{\alpha}_4$ gave 29. The last estimators, $\alpha^*_5$
and $\hat{\alpha}_5$, both gave 28 acceptable estimates.
We note that whenever the starred estimate was rejected so
was the maximum likelihood estimate, but not conversely.
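
The chance-rejection count used above can be made concrete with a short binomial calculation. The sketch below assumes the thirty runs are independent and that each test has exactly a .05 chance of a false rejection; the function names are illustrative only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k rejections in n independent tests."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def prob_at_least(k, n, p):
    """Probability of k or more rejections in n independent tests."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


N_RUNS, LEVEL = 30, 0.05
expected_rejections = N_RUNS * LEVEL   # 30 * .05
print("expected rejections:", expected_rejections)
for k in range(1, 7):
    print(k, "or more rejections:", round(prob_at_least(k, N_RUNS, LEVEL), 4))
```

Under these assumptions two or three rejections in thirty runs are unremarkable, while five or more would be unusual for a correctly calibrated test.
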

Tests were also performed on the estimates of $\sigma^2$ and
$\beta^2$. In order to test both $\sigma^{*2}$ and $\hat{\sigma}^2$,
an approximation to the distribution of chi-square given by
Wilson and Hilferty [19] was used. Their result is that
$(\chi^2/\nu)^{1/3}$ is approximately normally distributed with
mean $1 - 2/(9\nu)$ and variance $2/(9\nu)$.
This result and a discussion are also given in Kendall and
Stuart [11]. The results of the tests showed that the starred
estimator gave 25 acceptable estimates while the maximum
likelihood estimator gave 24. Again both estimates were
rejected on the same runs, with one exception, when the
maximum likelihood estimate was too high. All of the rejections
for the starred estimates were caused by underestimating
the true value.
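
The Wilson and Hilferty approximation quoted above is easy to sketch in Python. The code below only illustrates the transformation itself. The degrees of freedom used for the actual tests on $\sigma^2$ are not given in this section, so `nu` is left as a free argument, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_hilferty_z(chi2_value, nu):
    """Standard normal deviate from the cube-root transformation.

    (chi2/nu)^(1/3) is treated as normal with mean 1 - 2/(9 nu)
    and variance 2/(9 nu).
    """
    mean = 1.0 - 2.0 / (9.0 * nu)
    sd = math.sqrt(2.0 / (9.0 * nu))
    return ((chi2_value / nu) ** (1.0 / 3.0) - mean) / sd


def wilson_hilferty_quantile(p, nu):
    """Approximate chi-square quantile obtained by inverting the transformation."""
    z = NormalDist().inv_cdf(p)
    return nu * (1.0 - 2.0 / (9.0 * nu) + z * math.sqrt(2.0 / (9.0 * nu))) ** 3


def two_sided_p_value(chi2_value, nu):
    """Two-sided p-value for a chi-square statistic under the approximation."""
    z = wilson_hilferty_z(chi2_value, nu)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Even with only 3 degrees of freedom the approximate upper 5% point is
# close to the tabulated chi-square value of 7.81.
print(round(wilson_hilferty_quantile(0.95, 3), 2))
```

The approximation improves as the degrees of freedom increase, which is why it is convenient for variance estimates based on many observations.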

The starred estimates and maximum likelihood estimates
performed equally well in estimating $\beta^2$. Both gave
acceptable estimates in 26 out of the 30 runs. Of the four
incorrect estimates, both were high on three and low on one.
They both gave poor estimates on the same runs.

Overall the starred estimators performed as well as or
better than the maximum likelihood estimators. As can be
seen from the means and standard deviations at the bottom of
Tables 1 through 3, both estimates are very close to the true
value. The mean of the maximum likelihood estimates is closer
to the true value for $\alpha_2$, $\alpha_4$, and $\alpha_5$, but
not for $\alpha_3$, $\sigma^2$, or $\beta^2$.
Also we note that the sample standard deviations are smaller
for the maximum likelihood estimates except for $\beta^2$. None
of the differences seems to be appreciable in any case.


Table 1

ESTIMATES OF $\alpha_1$, $\alpha_2$, AND $\alpha_3$ FOR
COMPUTER SIMULATED PROCESS

            $\alpha_1$=.80                $\alpha_2$=.60    $\alpha_2$=.60    $\alpha_3$=.50    $\alpha_3$=.50
Run Number  $\alpha^*_1=\hat{\alpha}_1$   $\alpha^*_2$      $\hat{\alpha}_2$  $\alpha^*_3$      $\hat{\alpha}_3$

1           0.766                         0.484             0.503             0.415             0.421
2           0.770                         0.691             0.625             0.274             0.414
3           0.767                         0.639             0.647             0.623             0.564
4           0.747                         0.427             0.475             0.620             0.518
5           0.643*                        0.576             0.611             0.420             0.356
6           0.842                         0.749             0.722             0.423             0.553
7           0.723                         0.490             0.650             0.469             0.452
8           0.839                         0.488             0.661             0.485             0.521
9           0.906                         0.373             0.492             0.837*            0.719*
10          0.902                         0.565             0.514             0.556             0.540
11          0.892                         0.428             0.554             0.851*            0.691*
12          0.674                         0.730             0.663             0.422             0.493
13          0.799                         0.503             0.528             0.389             0.333
14          0.747                         0.754             0.655             0.775             0.541
15          0.826                         0.725             0.643             0.579             0.643
16          0.748                         0.696             0.592             0.277             0.381
17          0.795                         0.591             0.624             0.411             0.415
18          0.815                         0.634             0.626             0.389             0.508
19          0.861                         0.546             0.659             0.650             0.573
20          0.848                         0.506             0.557             0.559             0.398
21          0.810                         0.458             0.532             0.258             0.335
22          0.722                         0.702             0.636             0.375             0.368
23          0.746                         0.861             0.667             0.610             0.709
24          0.747                         0.819             0.571             0.522             0.603
25          0.826                         0.456             0.575             0.663             0.632
26          0.927                         0.695             0.784*            0.255             0.429
27          0.741                         0.459             0.514             0.529             0.600
28          0.652*                        0.601             0.566             0.607             0.642
29          0.719                         0.596             0.683             0.508             0.510
30          0.706                         0.352             0.435*            0.568             0.521

Mean        0.784                         0.586             0.599             0.511             0.513

Standard
Deviation   0.074                         0.134             0.079             0.158             0.112

*indicates estimate is significantly different from the true
value, at the .05 level of significance.

Table 2

ESTIMATES OF $\alpha_4$ AND $\alpha_5$ FOR
COMPUTER SIMULATED PROCESSES

            $\alpha_4$=.30    $\alpha_4$=.30    $\alpha_5$=.20    $\alpha_5$=.20
Run Number  $\alpha^*_4$      $\hat{\alpha}_4$  $\alpha^*_5$      $\hat{\alpha}_5$

1           0.360             0.292             -0.032            0.060
2           0.353             0.358             -0.077            0.090
3           0.241             0.282             -0.042            -0.013
4           0.257             0.256             -0.156*           -0.071*
5           0.366             0.449             0.334             0.204
6           0.195             0.202             0.105             0.109
7           0.010             0.073             -0.050            -0.048
8           0.223             0.267             0.080             0.076
9           0.133             0.234             0.347             0.338
10          0.312             0.351             0.268             0.300
11          0.352             0.264             0.209             0.212
12          0.063             0.070*            0.053             0.093
13          0.461             0.383             0.119             0.175
14          0.269             0.225             0.096             0.102
15          0.431             0.295             0.259             0.224
16          0.364             0.271             0.225             0.187
17          0U-J4             0.352             -0.095            0.027
18          0.261             0.147             0.209             0.196
19          0.099             0.194             0.040             0.108
20          0.388             0.425             0.214             0.199
21          0.070             0.168             0.022             -0.026
22          0.425             0.405             0.301             0.307
23          0.262             0.349             -0.106*           -0.121*
24          0.180             0.284             0.236             0.250
25          0.195             0.170             0.390             0.363
26          0.264             0.207             0.330             0.327
27          0.018             0.102             0.135             0.110
28          0.392             0.420             0.381             0.294
29          0.276             0.260             0.084             0.099
30          0.168             0.261             -0.033            0.029

Mean        0.261             0.267             0.128             0.140

Standard
Deviation   0.129             0.101             0.160             0.129

*indicates estimate is significantly different from
the true value, at the .05 level of significance.

Table 3

ESTIMATES OF $\sigma^2$ AND $\beta^2$ FOR
COMPUTER SIMULATED PROCESSES

            $\sigma^2$=1.00   $\sigma^2$=1.00   $\beta^2$=4.00    $\beta^2$=4.00
Run Number  $\sigma^{*2}$     $\hat{\sigma}^2$  $\beta^{*2}$      $\hat{\beta}^2$

1           0.842             0.826             4.982             5.127
2           0.947             0.928             3.490             3.594
3           0.805*            0.788*            6.847*            7.055*
4           0.791*            0.788*            3.915             3.965
5           0.920             0.899             4.256             4.394
6           1.037             1.015             5.268             5.432
7           0.993             0.975             4.070             4.179
8           0.940             0.925             3.120             3.278
9           1.180             1.163*            2.396*            2.451*
10          0.818*            0.811*            5.888*            5.991*
11          0.811*            0.795*            3.164             3.255
12          0.969             0.935             4.235             4.426
13          1.029             1.003             3.194             3.305
14          0.925             0.926             4.096             4.129
15          1.016             0.992             4.700             4.853
16          1.028             0.992             3.227             3.426
17          0.963             0.929             4.122             4.309
18          0.816*            0.797*            5.684*            5.876*
19          1.011             0.981             3.191             3.317
20          0.989             0.993             3.904             3.922
21          1.082             1.058             3.458             3.569
22          1.010             0.988             4.296             4.429
23          0.840             0.831             3.004             3.061
24          1.081             1.053             3.637             3.768
25          1.141             1.103             4.413             4.608
26          0.994             0.971             3.494             3.608
27          0.996             0.957             4.023             4.223
28          0.894             0.861             4.517             4.733
29          0.986             0.942             4.209             4.444
30          0.865             0.867             4.217             4.243

Mean        0.957             0.936             4.102             4.232

Standard
Deviation   0.102             0.096             0.944             0.971

*indicates estimate is significantly different from
the true value, at the .05 level of significance.

6.3  Application

As discussed in Section 6.1, patients were infused with
glucose and measurements were taken on their calcium level
prior to infusion and four times later at 90-minute intervals.
The data are given in Table 4. Inspecting the means for each
period, given at the bottom of Table 4, we see that, on the
average, the calcium reading was highest initially and that
infusion of glucose caused it to drop continually until the
last time period, where there is a mild increase in the level
of calcium. In Table 5, both the starred estimates and the
maximum likelihood estimates are given. Both estimators
gave similar results for all of the parameters, with $\alpha_4$
having the largest value, probably reflecting the increase in
the level of calcium from time period 3 to time period 4.
Table 6 shows the standard deviations and 95% confidence
intervals for $\alpha^*_1$, $\alpha^*_2$, $\alpha^*_3$, and
$\alpha^*_4$. The confidence intervals for $\alpha^*_1$,
$\alpha^*_2$, and $\alpha^*_3$ contain zero, implying that these
parameters do not differ significantly from zero. This could
have been guessed by noting the relatively small change in the
mean level of calcium from one period to the next. Since the mean
level rose in the last period the parameter $\alpha_4$ is large
and, as shown by the 95% confidence interval, is significantly
different from zero.

In testing the adequacy of the model we found
$-\rho \log \Lambda_1 = 4.91$ and $T = 1.77$. Since the
distribution of $-\rho \log \Lambda_1$ is approximately chi-square
with 3 degrees of freedom we compare the calculated value
against the tabulated value at the .05 level of significance.
We find $\chi^2_{3,.05} = 7.81$; since the calculated value is
less than this we accept the hypothesis of sphericity. The
distribution of $T$ is $F$ with 6 degrees of freedom in the
numerator and 114 degrees of freedom in the denominator. The
upper 5% point of this distribution is $F_{6,114,.05} = 2.18$.
Since the tabulated value is greater than the calculated value
we accept the adequacy of the model.
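
The two comparisons above can be checked with a short Python sketch. It uses the statistics and tabulated critical values quoted in the text. It also evaluates the exact upper-tail probability for chi-square with 3 degrees of freedom, which has a closed form; that p-value is an added illustration and is not part of the original analysis. The function names are hypothetical.

```python
import math


def chi2_upper_tail_3df(x):
    """P(X > x) for a chi-square variable with 3 degrees of freedom."""
    t = x / 2.0
    return math.erfc(math.sqrt(t)) + 2.0 * math.sqrt(t / math.pi) * math.exp(-t)


def exceeds_critical(statistic, critical_value):
    """True when the statistic falls in the upper-tail rejection region."""
    return statistic > critical_value


# Values quoted in the text.
sphericity_stat, chi2_critical = 4.91, 7.81   # chi-square, 3 d.f., .05 level
t_stat, f_critical = 1.77, 2.18               # F with 6 and 114 d.f., .05 level

print("reject sphericity:", exceeds_critical(sphericity_stat, chi2_critical))
print("reject model adequacy:", exceeds_critical(t_stat, f_critical))
print("approximate p-value for sphericity:", round(chi2_upper_tail_3df(sphericity_stat), 3))
```

Neither statistic reaches its critical value, in agreement with the conclusions stated above.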

To see how well the model fits the data we randomly
selected patient number 18 and calculated his measurements
for the four time periods using his previous readings.
Letting $y^*_{jk}$ denote the predicted value at time $j$ of
patient $k$ we have that

$$y^*_{jk} = \bar{y}_j + x^*_{jk} \;:\; 1 \le j \le 4 \;;\; 1 \le k \le 32 , \qquad (6.3.1)$$

where

$$x^*_{jk} = \alpha^*_j x_{j-1,k} \;:\; 1 \le j \le 4 \;;\; 1 \le k \le 32 , \qquad (6.3.2)$$

and

$$x_{jk} = y_{jk} - \bar{y}_j \;:\; 0 \le j \le 4 \;;\; 1 \le k \le 32 . \qquad (6.3.3)$$

Hence we may write

$$y^*_{jk} = \bar{y}_j - \alpha^*_j \bar{y}_{j-1} + \alpha^*_j y_{j-1,k} \;:\; 1 \le j \le 4 \;;\; 1 \le k \le 32 . \qquad (6.3.4)$$

Given that patient 18 had an initial reading of 9.9, the
prediction for his 90-minute reading is

$$y^*_{1,18} = 9.15 - .137(9.64) + .137(9.9) = 9.19 .$$

Similarly for the rest of the readings we find that
$y^*_{2,18} = 9.02$, $y^*_{3,18} = 9.06$, and $y^*_{4,18} = 9.29$.
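
The one-step predictions from equation (6.3.4) can be reproduced with the short Python sketch below. It uses the period means from Table 4, the first four starred estimates from Table 5, and the readings of patient 18 from Table 4. The function name is illustrative only, and the rounded published values are used, so results may differ from the text in the last digit.

```python
# Period means from the bottom of Table 4 (initial, 90, 180, 270, 360 minutes).
PERIOD_MEANS = [9.64, 9.15, 9.11, 8.94, 9.01]
# Starred estimates of alpha_1 through alpha_4 from Table 5.
ALPHA_STAR = [0.137, 0.334, 0.305, 0.506]
# Readings of patient 18 from Table 4.
PATIENT_18 = [9.9, 8.9, 9.5, 9.5, 9.8]


def one_step_predictions(readings, means, alphas):
    """Apply (6.3.4): y*_j = ybar_j - a_j * ybar_{j-1} + a_j * y_{j-1}."""
    predictions = []
    for j, a in enumerate(alphas, start=1):
        predictions.append(means[j] - a * means[j - 1] + a * readings[j - 1])
    return predictions


predictions = one_step_predictions(PATIENT_18, PERIOD_MEANS, ALPHA_STAR)
for j, (pred, actual) in enumerate(zip(predictions, PATIENT_18[1:]), start=1):
    print(f"period {j}: predicted {pred:.2f}, observed {actual:.1f}")
```

Each prediction uses the patient's observed reading from the preceding period, not the preceding prediction, which matches the phrase "using his previous readings."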


Table 4

LEVEL OF CALCIUM IN GRAMS PER LITER IN
PATIENTS INFUSED WITH GLUCOSE

Patient     Initial     90          180         270         360
Number      Period      Minutes     Minutes     Minutes     Minutes

1           9.1         9.2         8.9         8.5         8.3
2           10.0        9.2         9.5         9.2         8.4
3           10.1        9.8         10.1        8.0         9.7
4           10.0        10.1        9.1         9.4         9.5
5           9.7         8.9         9.1         9.1         9.2
6           9.5         8.7         9.1         8.3         8.6
7           9.5         9.4         9.3         9.6         9.3
8           10.1        9.2         9.3         9.1         8.7
9           9.6         8.9         9.3         9.4         8.9
10          9.1         9.3         9.0         9.0         9.0
11          9.6         8.8         8.9         8.8         9.4
12          9.3         9.4         9.3         9.4         9.7
13          10.2        9.5         9.8         9.9         9.8
14          9.2         8.8         9.4         8.9         8.2
15          9.6         9.4         8.9         8.9         9.0
16          10.1        9.0         9.1         9.2         9.1
17          9.4         8.5         8.5         8.6         8.7
18          9.9         8.9         9.5         9.5         9.8
19          10.4        8.9         9.4         8.3         8.1
20          9.0         8.8         8.5         8.5         8.4
21          9.7         9.6         9.4         8.4         8.8
22          10.2        8.1         9.0         8.9         9.4
23          9.2         10.3        9.0         8.7         8.7
24          9.7         8.9         9.1         9.1         9.2
25          9.0         8.4         8.1         8.7         8.7
26          9.4         9.2         9.2         9.2         9.0
27          9.4         8.9         8.8         8.7         9.0
28          10.1        9.8         9.1         8.8         9.0
29          9.8         9.3         9.5         9.3         9.6
30          9.6         9.1         8.4         8.6         8.5
31          9.5         8.8         8.5         8.7         9.2
32          9.4         9.6         9.4         9.4         9.5

Mean        9.64        9.15        9.11        8.94        9.01


Table 5

ESTIMATES OF THE PARAMETERS
FOR THE GLUCOSE STUDY

Type of                              Parameters
Estimate      $\alpha_1$    $\alpha_2$    $\alpha_3$    $\alpha_4$    $\sigma^2$    $\beta^2$

Starred
Estimate      0.137         0.334         0.305         0.506         0.173         0.843

Maximum
Likelihood    0.137         0.383         0.312         0.590         0.174         0.854

Comparing the predictions for patient 18 to the actual
measurements of 8.9, 9.5, 9.5, and 9.8 (Table 4), we see that
the model gives reasonable predictions.